Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)? - Mailing list pgsql-hackers
From | Amit Kapila |
---|---|
Subject | Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)? |
Date | |
Msg-id | CAA4eK1+di2+Nc8-=jZqdDLPzvYw-S-QSDcG6e1nOU3f9-07M0Q@mail.gmail.com |
In response to | Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)? (Alexander Korotkov <a.korotkov@postgrespro.ru>) |
Responses | Re: [HACKERS] Challenges preventing us moving to 64 bit transactionid (XID)? |
List | pgsql-hackers |
On Thu, Jun 22, 2017 at 9:06 PM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> On Wed, Jun 7, 2017 at 11:33 AM, Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
>>
>> On Tue, Jun 6, 2017 at 4:05 PM, Peter Eisentraut
>> <peter.eisentraut@2ndquadrant.com> wrote:
>>>
>>> On 6/6/17 08:29, Bruce Momjian wrote:
>>> > On Tue, Jun 6, 2017 at 06:00:54PM +0800, Craig Ringer wrote:
>>> >> Tom's point is, I think, that we'll want to stay pg_upgrade
>>> >> compatible. So when we see a pg10 tuple and want to add a new page
>>> >> with a new page header that has an epoch, but the whole page is full
>>> >> so there isn't 32 bits left to move tuples "down" the page, what do we
>>> >> do?
>>> >
>>> > I guess I am missing something. If you see an old page version number,
>>> > you know none of the tuples are from running transactions so you can
>>> > just freeze them all, after consulting the pg_clog. What am I missing?
>>> > If the page is full, why are you trying to add to the page?
>>>
>>> The problem is if you want to delete from such a page. Then you need to
>>> update the tuple's xmax and stick the new xid epoch somewhere.
>>>
>>> We had an unconference session at PGCon about this. These issues were
>>> all discussed and some ideas were thrown around. We can expect a patch
>>> to appear soon, I think.
>>
>>
>> Right. I'm now working on splitting my large patch for 64-bit xids into
>> patchset.
>> I'm planning to post patchset in the beginning of next week.
>
>
> Work on this patch took longer than I expected. It is still in not so good
> shape, but I decided to publish it anyway in order to not stop progress in
> this area.
> I also tried to split this patch into several. But actually I manage to
> separate few small pieces, while most of changes are remaining in the single
> big diff.
> Long story short, patchset is attached.
>
> 0001-64bit-guc-relopt-1.patch
> This patch implements 64 bit GUCs and relation options which are used in
> further patches.
>
> 0002-heap-page-special-1.patch
> Putting xid and multixact bases into PageHeaderData would take extra 16
> bytes on index pages too. That would be waste of space for indexes. This
> is why I decided to put bases into special area of heap pages.
> This patch adds special area for heap pages containing prune xid and magic
> number. Magic number is different for regular heap page and sequence page.
>

  uint16 pd_pagesize_version;
- TransactionId pd_prune_xid; /* oldest prunable XID, or zero if none */
  ItemIdData pd_linp[FLEXIBLE_ARRAY_MEMBER]; /* linepointer array */
 } PageHeaderData;

Why have you moved pd_prune_xid from page header?

> 0003-64bit-xid-1.patch
> It's the major patch. It redefines TransactionID as 64-bit integer and
> defines 32-bit ShortTransactionID which is used for t_xmin and t_xmax.
> Transaction id comparison becomes straight instead of circular. Base values
> for xids and multixact ids are stored in heap page special. SLRUs also
> became 64-bit and non-circular. To be able to calculate xmin/xmax without
> accessing heap page, base values are copied into HeapTuple. Correspondingly
> HeapTupleHeader(Get|Set)(Xmin|Xmax) becomes just
> HeapTuple(Get|Set)(Xmin|Xmax) whose require HeapTuple not just
> HeapTupleHeader. heap_page_prepare_for_xid() is used to ensure that given
> xid fits particular page base. If it doesn't fit then base of page is
> shifted, that could require single-page freeze. Format for wal is changed
> in order to prevent unaligned access to TransactionId. *_age GUCs and
> relation options are changed to 64-bit. Forced "autovacuum to prevent
> wraparound" is removed, but there is still freeze to truncate SLRUs.
>

It seems there is no README or some detailed explanation of how all
this works like how the value of pd_xid_base is maintained. I don't
think there are enough comments in the patch to explain the things.
I think it will be easier to understand and review the patch if you
provide some more details either in email or in the patch.

> 0004-base-values-for-testing-1.patch
> This patch is used for testing that calculations using 64-bit bases and
> short 32-bit xid values are correct. It provides initdb options for initial
> xid, multixact id and multixact offset values. Regression tests initialize
> cluster with large (more than 2^32) values.
>
> There are a lot of open items, but I would like to notice some of them:
> * WAL becomes significantly larger due to storage 8 byte xids instead of 4
> byte xids. Probably, it's needed to use base approach in WAL too.
>

Yeah and I think it can impact performance as well. By any chance have
you run pgbench read-write to see the performance impact of this patch?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
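
For readers following the 0003 description, the per-page base scheme can be illustrated with a small sketch. This is not code from the patch: ToyPage, FullXid, ShortXid, FIRST_NORMAL_SHORT, full_xid(), xid_fits() and prepare_for_xid() are hypothetical names, and the rule used to pick a new base is an assumption made only for illustration. It shows a 64-bit xid being rebuilt as base plus a 32-bit value stored in the tuple, and a base shift that fails when the xids on the page span more than 32 bits, which is the case where the email says a single-page freeze could be required.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t FullXid;   /* hypothetical: 64-bit transaction id */
typedef uint32_t ShortXid;  /* hypothetical: 32-bit offset stored per tuple */

/* Assumption: short values below this are reserved (invalid/frozen markers). */
#define FIRST_NORMAL_SHORT 3u

/* Hypothetical stand-in for a heap page: one base plus per-tuple short xids. */
typedef struct {
    FullXid  xid_base;
    ShortXid xmin[4];
    size_t   ntuples;
} ToyPage;

/* Reconstruct the full xid; reserved short values are returned unchanged. */
static FullXid full_xid(const ToyPage *p, ShortXid s)
{
    return s < FIRST_NORMAL_SHORT ? (FullXid) s : p->xid_base + s;
}

/* True if xid can be written as xid_base + s for some normal 32-bit s. */
static bool xid_fits(const ToyPage *p, FullXid xid)
{
    return xid >= p->xid_base &&
           xid - p->xid_base >= FIRST_NORMAL_SHORT &&
           xid - p->xid_base <= UINT32_MAX;
}

/*
 * Make xid representable on the page, shifting the base if needed.
 * Returns false when the span of xids on the page is too wide; a real
 * system would then have to freeze old tuples before retrying.
 */
static bool prepare_for_xid(ToyPage *p, FullXid xid)
{
    if (xid_fits(p, xid))
        return true;

    /* Find the range of full xids that must stay representable. */
    FullXid lo = xid, hi = xid;
    for (size_t i = 0; i < p->ntuples; i++) {
        if (p->xmin[i] < FIRST_NORMAL_SHORT)
            continue;               /* reserved values carry no offset */
        FullXid f = full_xid(p, p->xmin[i]);
        if (f < lo) lo = f;
        if (f > hi) hi = f;
    }
    if (lo < FIRST_NORMAL_SHORT || hi - lo > UINT32_MAX - FIRST_NORMAL_SHORT)
        return false;

    /* Assumed policy: put the base just below the oldest xid in use. */
    FullXid new_base = lo - FIRST_NORMAL_SHORT;
    for (size_t i = 0; i < p->ntuples; i++) {
        if (p->xmin[i] >= FIRST_NORMAL_SHORT)
            p->xmin[i] = (ShortXid) (full_xid(p, p->xmin[i]) - new_base);
    }
    p->xid_base = new_base;
    return true;
}
```

In this sketch every stored short value is rewritten when the base moves, so the full xids seen through full_xid() stay the same before and after the shift. How the patch itself chooses and maintains pd_xid_base is exactly the detail the reply above asks to have documented.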