From 6104a112d52ccc08466b9c57a287d737150ff194 Mon Sep 17 00:00:00 2001 From: Peter Geoghegan Date: Sat, 22 Apr 2023 12:41:00 -0700 Subject: [PATCH v2 5/9] Move Interpreting XID stamps from tuple headers. This is intended to be fairly close to a mechanical change. It isn't entirely mechanical, though, since the original wording has been slightly modified for it to work in context. Structuring things this way should make life a little easier for doc translators. --- doc/src/sgml/maintenance.sgml | 81 +++++++---------------------------- doc/src/sgml/storage.sgml | 62 +++++++++++++++++++++++++++ 2 files changed, 78 insertions(+), 65 deletions(-) diff --git a/doc/src/sgml/maintenance.sgml b/doc/src/sgml/maintenance.sgml index 62e22d861..f554e12bf 100644 --- a/doc/src/sgml/maintenance.sgml +++ b/doc/src/sgml/maintenance.sgml @@ -447,75 +447,26 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu wraparound - - wraparound - of transaction IDs - + + wraparound + of transaction IDs + - PostgreSQL's - MVCC transaction semantics - depend on being able to compare transaction ID (XID) - numbers: a row version with an insertion XID greater than the current - transaction's XID is in the future and should not be visible - to the current transaction. But since transaction IDs have limited size - (32 bits) a cluster that runs for a long time (more - than 4 billion transactions) would suffer transaction ID - wraparound: the XID counter wraps around to zero, and all of a sudden - transactions that were in the past appear to be in the future — which - means their output become invisible. In short, catastrophic data loss. - (Actually the data is still there, but that's cold comfort if you cannot - get at it.) To avoid this, it is necessary to vacuum every table - in every database at least once every two billion transactions. 
+ PostgreSQL's MVCC transaction semantics depend on + being able to compare transaction + ID numbers (XIDs) to determine + whether a row is visible to each query's MVCC snapshot + (see + interpreting XID stamps from tuple headers). But since + on-disk storage of transaction IDs in heap pages uses a truncated + 32-bit representation to save space (rather than the full 64-bit + representation), it is necessary to vacuum every table in every + database at least once every two billion + transactions (though far more frequent vacuuming is typical). - - The reason that periodic vacuuming solves the problem is that - VACUUM will mark rows as frozen, indicating that - they were inserted by a transaction that committed sufficiently far in - the past that the effects of the inserting transaction are certain to be - visible to all current and future transactions. - Normal XIDs are - compared using modulo-2^32 arithmetic. This means - that for every normal XID, there are two billion XIDs that are - older and two billion that are newer; another - way to say it is that the normal XID space is circular with no - endpoint. Therefore, once a row version has been created with a particular - normal XID, the row version will appear to be in the past for - the next two billion transactions, no matter which normal XID we are - talking about. If the row version still exists after more than two billion - transactions, it will suddenly appear to be in the future. To - prevent this, PostgreSQL reserves a special XID, - FrozenTransactionId, which does not follow the normal XID - comparison rules and is always considered older - than every normal XID. - Frozen row versions are treated as if the inserting XID were - FrozenTransactionId, so that they will appear to be - in the past to all normal transactions regardless of wraparound - issues, and so such row versions will be valid until deleted, no matter - how long that is. 
- - - - In PostgreSQL versions before 9.4, freezing was - implemented by actually replacing a row's insertion XID - with FrozenTransactionId, which was visible in the - row's xmin system column. Newer versions just set a flag - bit, preserving the row's original xmin for possible - forensic use. However, rows with xmin equal - to FrozenTransactionId (2) may still be found - in databases pg_upgrade'd from pre-9.4 versions. - - - Also, system catalogs may contain rows with xmin equal - to BootstrapTransactionId (1), indicating that they were - inserted during the first phase of initdb. - Like FrozenTransactionId, this special XID is treated as - older than every normal XID. - - - controls how old an XID value has to be before rows bearing that XID will be diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml index e5b9f3f1f..f31a002fc 100644 --- a/doc/src/sgml/storage.sgml +++ b/doc/src/sgml/storage.sgml @@ -1072,6 +1072,68 @@ data. Empty in ordinary tables. it might be compressed, too (see ). + + + Interpreting XID stamps from tuple headers + + + The on-disk representation of transaction IDs (the representation + used in t_xmin and + t_xmax fields) uses a truncated 32-bit + representation of transaction IDs, not the full 64-bit + representation. This is not suitable for long-term storage without + special processing by VACUUM. + + + + VACUUM will + mark tuple headers frozen, indicating + that all eligible rows on the page were inserted by a transaction + that committed sufficiently far in the past that the effects of the + inserting transaction are certain to be visible to all current and + future transactions. Normal XIDs are compared using + modulo-2^32 arithmetic. This means that + for every normal XID, there are two billion XIDs that are + older and two billion that are newer; + another way to say it is that the normal XID space is circular with + no endpoint. 
Therefore, once a row version has been created with a + particular normal XID, the row version will appear to be in + the past for the next two billion transactions, no matter + which normal XID we are talking about. If the row version still + exists after more than two billion transactions, it will suddenly + appear to be in the future. To prevent this, + PostgreSQL reserves a special XID, + FrozenTransactionId, which does not follow the + normal XID comparison rules and is always considered older than + every normal XID. Frozen row versions are treated as if the + inserting XID were FrozenTransactionId, so that + they will appear to be in the past to all normal + transactions regardless of wraparound issues, and so such row + versions will be valid until deleted, no matter how long that is. + + + + + In PostgreSQL versions before 9.4, freezing was + implemented by actually replacing a row's insertion XID + with FrozenTransactionId, which was visible in the + row's xmin system column. Newer versions just set a flag + bit, preserving the row's original xmin for possible + forensic use. However, rows with xmin equal + to FrozenTransactionId (2) may still be found + in databases pg_upgrade'd from pre-9.4 versions. + + + Also, system catalogs may contain rows with xmin equal + to BootstrapTransactionId (1), indicating that they were + inserted during the first phase of initdb. + Like FrozenTransactionId, this special XID is treated as + older than every normal XID. + + + + + -- 2.40.0
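
Reviewer note, not part of the patch: the circular comparison rule described in the moved text may be easier to check against a small model. The following is a minimal Python sketch of that rule, not PostgreSQL's C implementation. The function names and the FIRST_NORMAL_XID constant are assumptions introduced for illustration; the values 1 and 2 for the special XIDs come from the text above.

```python
# Illustrative model of the XID comparison rule described in the docs text.
# This is a sketch for reasoning about wraparound, not PostgreSQL source code.

BOOTSTRAP_XID = 1      # BootstrapTransactionId, as stated in the docs text
FROZEN_XID = 2         # FrozenTransactionId, as stated in the docs text
FIRST_NORMAL_XID = 3   # assumption for this sketch: normal XIDs follow the special ones

MASK32 = 0xFFFFFFFF    # XIDs in tuple headers are 32 bits wide
HALF_SPACE = 0x80000000  # 2^31, roughly "two billion"


def is_normal(xid: int) -> bool:
    """Return True for XIDs that follow the circular comparison rule."""
    return xid >= FIRST_NORMAL_XID


def xid_precedes(a: int, b: int) -> bool:
    """Return True if XID a is considered older than XID b."""
    if not is_normal(a) or not is_normal(b):
        # Special XIDs are compared as plain integers, so they are
        # always older than every normal XID.
        return a < b
    # Modulo-2^32 difference, interpreted as a signed 32-bit value:
    # a "negative" difference means a is in b's past.
    diff = (a - b) & MASK32
    return diff >= HALF_SPACE


if __name__ == "__main__":
    old = 100
    # Fewer than 2^31 transactions later, the old XID is still in the past.
    print(xid_precedes(old, old + 2**31 - 1))
    # More than 2^31 transactions later, it appears to be in the future.
    print(xid_precedes(old, (old + 2**31 + 1) & MASK32))
    # A frozen XID stays in the past regardless of distance.
    print(xid_precedes(FROZEN_XID, 0xFFFFFFF0))
```

The sketch compares the two special XIDs with ordinary integer ordering, which is one simple way to make them older than every normal XID. This mirrors the behavior the text describes but is not claimed to match the server's code path exactly.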