Thread: questions about wraparound
Hi all, I'm doing some experiments on a cluster to see what happens at xid wraparound, and I'm approaching it. There is no activity in any database but testdb. Scenario: I've a procedure that is consuming all xids, while another connection is inserting a tuple every 20 minutes or so in a table, just to prevent autovacuum to freeze in emergency. autovacuum is globally turned off. Therefore, I'm approaching wraparound: testdb=> select datname, datfrozenxid, age( datfrozenxid ), txid_current() from pg_database;WARNING: database "postgres" must be vacuumed within 7989757 transactions HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database. You might also need to commit or roll back old prepared transactions, or drop stale replication slots. datname | datfrozenxid | age | txid_current -----------+--------------+------------+-------------- postgres | 1221679879 | 2139493890 | 11951108361 backupdb | 1221679879 | 2139493890 | 11951108361 template1 | 1221679879 | 2139493890 | 11951108361 template0 | 1221679879 | 2139493890 | 11951108361 testdb | 1221679879 | 2139493890 | 11951108361 pgbench | 1221679879 | 2139493890 | 11951108361 (6 rows) What puzzles me is that I'm somehow "locking" the testdb.wa table (by inserting a tuple every 20 minutes), so all other tables and databases are free to be frozen by an emergency autovacuum. And I was expecting the problem to happen due to the testdb.wa table, and therefore the hint message to be related to "testdb", not "postgres" database. Digging I found that all the database are becoming old, and all the tables in every database has the same age. Therefore my question is: shouldn't autovacuum be able to freeze other tables/databases? I mean, the wraparound problem in this scenario will cause problems, but I was expecting different numbers for different tables/databases. The other question is: the xid is defined as a 32 bit integer: typedef uint32 TransactionId; but it is exposed as a 64 bit integer typedef uint64 txid; appending EpochFromFullTransactionId (that I'm not able to find in the sources). The dumb question then is: why use the 32 bit machinery if the txid is exposed as 64 bit wide value? Thanks, Luca
On Thu, Mar 18, 2021 at 9:56 AM Luca Ferrari <fluca1978@gmail.com> wrote: > > What puzzles me is that I'm somehow "locking" the testdb.wa table (by > inserting a tuple every 20 minutes), so all other tables and databases > are free to be frozen by an emergency autovacuum. And I was expecting > the problem to happen due to the testdb.wa table, and therefore the > hint message to be related to "testdb", not "postgres" database. > Digging I found that all the database are becoming old, and all the > tables in every database has the same age. > Therefore my question is: shouldn't autovacuum be able to freeze other > tables/databases? I mean, the wraparound problem in this scenario will > cause problems, but I was expecting different numbers for different > tables/databases. And I hit the 1 million transaction theshold, so I tunrned off the cluster, gone into single user mode and vacuumed the database postgres (oid = 13811). Then the system asked me to vacuum another database, on which I've not generated traffic. % sudo -u postgres postgres --single -D /postgres/12/data postgres WARNING: database with OID 13811 must be vacuumed within 1000000 transactions HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database. PostgreSQL stand-alone backend 12.5 backend> VACUUM VERBOSE WARNING: database "backupdb" must be vacuumed within 1000000 transactions Instead I restarted the multiuser postgres, and I was able to connect to the problematic database testdb and issue read-write transacctions. And now the situation is: testdb=> select datname, datfrozenxid, age(datfrozenxid) from pg_database; datname | datfrozenxid | age -----------+--------------+---------- postgres | 3318163526 | 50000002 backupdb | 3318163526 | 50000002 template1 | 3368163526 | 2 template0 | 3368163526 | 2 testdb | 3318163526 | 50000002 pgbench | 3318163526 | 50000002 I don't understand why template0 and template1 are two transactions old (I did two read-write transactions) while all other databases including the vacuumed postgres are 50 millions old. Clearly that number is the manual vacuum freeze age: testdb=> show vacuum_freeze_min_age; vacuum_freeze_min_age ----------------------- 50000000 but (i) why templates have different values and (ii) why vacuuming a database has changed the situation of all the other databases? I digged in the logs, and at server restart (after single user mode), I found a lot of rows related to "aggressive automatic vacuum" that involved all databases, including templtes: % sudo grep automatic $PGDATA/log/postgresql.log | grep template1 | head -n 5 LOG: automatic aggressive vacuum to prevent wraparound of table "template1.pg_catalog.pg_statistic": index scans: 0 LOG: automatic aggressive vacuum to prevent wraparound of table "template1.pg_catalog.pg_type": index scans: 0 LOG: automatic aggressive vacuum to prevent wraparound of table "template1.pg_catalog.pg_foreign_server": index scans: 0 LOG: automatic aggressive vacuum to prevent wraparound of table "template1.pg_catalog.pg_authid": index scans: 0 LOG: automatic aggressive vacuum to prevent wraparound of table "template1.pg_catalog.pg_statistic_ext_data": index scans: 0 and similar rows about other databases exist. Again, I'm not able to figure out the differences in ages then. Thanks, Luca
On Thu, Mar 18, 2021 at 12:14 PM Luca Ferrari <fluca1978@gmail.com> wrote: > testdb=> select datname, datfrozenxid, age(datfrozenxid) from pg_database; > datname | datfrozenxid | age > -----------+--------------+---------- > postgres | 3318163526 | 50000002 > backupdb | 3318163526 | 50000002 > template1 | 3368163526 | 2 > template0 | 3368163526 | 2 > testdb | 3318163526 | 50000002 > pgbench | 3318163526 | 50000002 > I did it again: I provoked another wraparound and entered the single user mode to vacuum. This is the situation before: backend> select age(datfrozenxid), datname from pg_database 1: age (typeid = 23, len = 4, typmod = -1, byval = t) 2: datname (typeid = 19, len = 64, typmod = -1, byval = f) ---- 1: age = "2146483647" (typeid = 23, len = 4, typmod = -1, byval = t) 2: datname = "postgres" (typeid = 19, len = 64, typmod = -1, byval = f) ---- 1: age = "2146483647" (typeid = 23, len = 4, typmod = -1, byval = t) 2: datname = "backupdb" (typeid = 19, len = 64, typmod = -1, byval = f) ---- 1: age = "2146483647" (typeid = 23, len = 4, typmod = -1, byval = t) 2: datname = "template1" (typeid = 19, len = 64, typmod = -1, byval = f) ---- 1: age = "2146483647" (typeid = 23, len = 4, typmod = -1, byval = t) 2: datname = "template0" (typeid = 19, len = 64, typmod = -1, byval = f) ---- 1: age = "2146483647" (typeid = 23, len = 4, typmod = -1, byval = t) 2: datname = "testdb" (typeid = 19, len = 64, typmod = -1, byval = f) ---- 1: age = "2146483647" (typeid = 23, len = 4, typmod = -1, byval = t) 2: datname = "pgbench" (typeid = 19, len = 64, typmod = -1, byval = f) backend> vacuum 2021-03-20 11:54:44.878 CET [87179] WARNING: database "backupdb" must be vacuumed within 1000000 transactions 2021-03-20 11:54:44.878 CET [87179] HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database. You might also need to commit or roll back old prepared transactions, or drop stale replication slots. and after the vacuum I was able to start other in normal mode, without having to vacuum again another database as suggested by the HINT. However, templates have a zero age instead of the other databases: % psql -U luca -c "SELECT age(datfrozenxid), datname from pg_database;" testdb age | datname ----------+----------- 50000000 | postgres 50000000 | backupdb 0 | template1 0 | template0 50000000 | testdb 50000000 | pgbench I suspect freezing is doing it "totally" for a idatistemplate database, even if I don't understand why.
On Sat, Mar 20, 2021 at 12:07 PM Luca Ferrari <fluca1978@gmail.com> wrote: > I suspect freezing is doing it "totally" for a idatistemplate > database, even if I don't understand why. I can confirm that freezing a template database is done by means of setting it age to zero. I've set the datistempalte flag for testdb and reissued a wraparong (I'm torturing my postgresql!), and after a vacuum its age went to zero. I'm not able to find this behavior in the documentation however, and still don't understand why a template database should have a different behavior (at least, I can imagine only to reduce the future workload of vacuuming a template database). Here it is, again, what I did in single user mode: backend> select datname, age( datfrozenxid), current_setting( 'vacuum_freeze_min_age' ) from pg_database; 1: datname (typeid = 19, len = 64, typmod = -1, byval = f) 2: age (typeid = 23, len = 4, typmod = -1, byval = t) 3: current_setting (typeid = 25, len = -1, typmod = -1, byval = f) ---- 1: datname = "postgres" (typeid = 19, len = 64, typmod = -1, byval = f) 2: age = "2138438218" (typeid = 23, len = 4, typmod = -1, byval = t) 3: current_setting = "50000000" (typeid = 25, len = -1, typmod = -1, byval = f) ---- 1: datname = "backupdb" (typeid = 19, len = 64, typmod = -1, byval = f) 2: age = "2138438218" (typeid = 23, len = 4, typmod = -1, byval = t) 3: current_setting = "50000000" (typeid = 25, len = -1, typmod = -1, byval = f) ---- 1: datname = "template1" (typeid = 19, len = 64, typmod = -1, byval = f) 2: age = "2138438218" (typeid = 23, len = 4, typmod = -1, byval = t) 3: current_setting = "50000000" (typeid = 25, len = -1, typmod = -1, byval = f) ---- 1: datname = "template0" (typeid = 19, len = 64, typmod = -1, byval = f) 2: age = "2138438218" (typeid = 23, len = 4, typmod = -1, byval = t) 3: current_setting = "50000000" (typeid = 25, len = -1, typmod = -1, byval = f) ---- 1: datname = "testdb" (typeid = 19, len = 64, typmod = -1, byval = f) 2: age = "2138438218" (typeid = 23, len = 4, typmod = -1, byval = t) 3: current_setting = "50000000" (typeid = 25, len = -1, typmod = -1, byval = f) ---- 1: datname = "pgbench" (typeid = 19, len = 64, typmod = -1, byval = f) 2: age = "2138438218" (typeid = 23, len = 4, typmod = -1, byval = t) 3: current_setting = "50000000" (typeid = 25, len = -1, typmod = -1, byval = f) ---- backend> set vacuum_freeze_min_age to 1234 backend> vacuum WARNING: database "backupdb" must be vacuumed within 9045429 transactions HINT: To avoid a database shutdown, execute a database-wide VACUUM in that database. You might also need to commit or roll back old prepared transactions, or drop stale replication slots. backend> and here it is the situation after a restart: testdb=> select datname, age( datfrozenxid ) from pg_database; datname | age -----------+---------- postgres | 1234 backupdb | 50000000 template1 | 0 template0 | 0 testdb | 0 pgbench | 50000000
On Mon, Mar 22, 2021 at 8:56 AM Luca Ferrari <fluca1978@gmail.com> wrote: > backend> select datname, age( datfrozenxid), current_setting( > 1: datname = "template0" (typeid = 19, len = 64, typmod > = -1, byval = f) > 2: age = "2138438218" (typeid = 23, len = 4, typmod = -1, byval = t) > 3: current_setting = "50000000" (typeid = 25, len = > -1, typmod = -1, byval = f) > ---- > 1: datname = "testdb" (typeid = 19, len = 64, typmod = -1, byval = f) > 2: age = "2138438218" (typeid = 23, len = 4, typmod = -1, byval = t) > 3: current_setting = "50000000" (typeid = 25, len = > -1, typmod = -1, byval = f) Another thing that comes into my mind as a doubt is: why are all databases becoming old? I mean, I'm provoking activity _only_ on testdb, therefore other database such as template1 are not doing anything. Why an emergency autovacuum is not freezing such databases before the wraparound happens? Luca
On Mon, 2021-03-29 at 10:33 +0200, Luca Ferrari wrote: > Another thing that comes into my mind as a doubt is: why are all > databases becoming old? I mean, I'm provoking activity _only_ on > testdb, therefore other database such as template1 are not doing > anything. That one I can answer. If there is no activity on a database, its "datfrozenxid" stays the same. So, as transaction IDs are consumed, it is getting older automatically. That means that even inactive databases will receive an anti-wraparound vacuum occasionally. But that should not have to do anything except advance "datfrozenxid". Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Mon, Mar 29, 2021 at 11:14 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > If there is no activity on a database, its "datfrozenxid" stays > the same. So, as transaction IDs are consumed, it is getting older > automatically. That means that even inactive databases will receive > an anti-wraparound vacuum occasionally. But that should not have > to do anything except advance "datfrozenxid". Thanks, but this is exactly my point: since inactive databases are getting older, why it appears to me that autovacuum is not freezing them? I mean, in my experiment age( datfrozenzid) reports the same age for every database, even the inactive ones. And since there were inactive databases, I was expecting emergency autovacuum to be able to run and freeze them. Or am I wrong? Luca
On Mon, 2021-03-29 at 16:58 +0200, Luca Ferrari wrote: > > If there is no activity on a database, its "datfrozenxid" stays > > the same. So, as transaction IDs are consumed, it is getting older > > automatically. That means that even inactive databases will receive > > an anti-wraparound vacuum occasionally. But that should not have > > to do anything except advance "datfrozenxid". > > Thanks, but this is exactly my point: since inactive databases are > getting older, why it appears to me that autovacuum is not freezing > them? I mean, in my experiment age( datfrozenzid) reports the same age > for every database, even the inactive ones. And since there were > inactive databases, I was expecting emergency autovacuum to be able to > run and freeze them. Or am I wrong? I didn't follow the rest of the thread, but autovacuum should handle those databases and advance their "datfrozenxid". Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Mon, Mar 29, 2021 at 7:12 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > I didn't follow the rest of the thread, but autovacuum should handle > those databases and advance their "datfrozenxid". That did not happen. In short: I turned off autovacuum globally, then in a database I consumed a lot of xid while keeping a table locked. This produced a wraparound in such database, but I was expecting emergency autovacuum to keep other databases frozen. PostgreSQL 12.5. Luca
Hi Luca, On Mon, 22 Mar 2021 08:56:46 +0100 Luca Ferrari <fluca1978@gmail.com> wrote: > I can confirm that freezing a template database is done by means of setting > it age to zero. [...] > and here it is the situation after a restart: > > testdb=> select datname, age( datfrozenxid ) from pg_database; > datname | age > -----------+---------- > postgres | 1234 > backupdb | 50000000 > template1 | 0 > template0 | 0 > testdb | 0 > pgbench | 50000000 The difference between a "vacuum" and "vacuum freeze" is whether the vacuum process must scan non-frozen blocks as well, according to the visibility map. The later is called "aggressive" vacuum because it scan all blocks, even the clean ones, as far as they are not already marked as frozen in the visibility map. Whatever the vacuum you are launching, if the process find a rows older than vacuum_freeze_min_age, it freezes it. Agressive vacuum is not different and respect vacuum_freeze_min_age as well. That's why your oldest row in each database is 50000000 after a "vacuum freeze" (aka. aggressive vacuum, aka. "vacuum to avoid wraparound"). Try to temporary set vacuum_freeze_min_age=45000000 and freeze_table_age=0, then run a simple vacuum on your database. This will effectively freeze you database and set its age to the oldest row: your new vacuum_freeze_min_age, 45000000. Considering the template databases, the default vacuum_freeze_min_age is forced to 0 in source code. That's why you find a different age between template databases and others after a freeze. In regard with the databases age moving altogether, even when only one of them is receiving writes. The XID space is shared between all the database. In other words, a transaction can not be used in two different database, unless they apply on shared relations (those in pg_global tablespace) and probably cloned ones from templates. So if a database alone is consuming XIDs all other are getting older and older and will eventually need a vacuum. > I'm not able to find this behavior in the documentation however, I don't know if it is explained somewhere in doc, I couldn't find it either. But you can find this information in function "do_autovacuum()" in src/backend/postmaster/autovacuum.c: /* * Find the pg_database entry and select the default freeze ages. We use * zero in template and nonconnectable databases, else the system-wide * default. */ tuple = SearchSysCache1(DATABASEOID, ObjectIdGetDatum(MyDatabaseId)); if (!HeapTupleIsValid(tuple)) elog(ERROR, "cache lookup failed for database %u", MyDatabaseId); dbForm = (Form_pg_database) GETSTRUCT(tuple); if (dbForm->datistemplate || !dbForm->datallowconn) { default_freeze_min_age = 0; default_freeze_table_age = 0; default_multixact_freeze_min_age = 0; default_multixact_freeze_table_age = 0; } else { default_freeze_min_age = vacuum_freeze_min_age; default_freeze_table_age = vacuum_freeze_table_age; default_multixact_freeze_min_age = vacuum_multixact_freeze_min_age; default_multixact_freeze_table_age = vacuum_multixact_freeze_table_age; } > and still don't understand why a template database should have a different > behavior (at least, I can imagine only to reduce the future workload of > vacuuming a template database). I suspect that it helps creating database with already frozen blocs, from a full frozen template. Regards,
On Thu, 18 Mar 2021 09:56:16 +0100 Luca Ferrari <fluca1978@gmail.com> wrote: [...] > Therefore my question is: shouldn't autovacuum be able to freeze other > tables/databases? I mean, the wraparound problem in this scenario will > cause problems, but I was expecting different numbers for different > tables/databases. In fact, when an autovacuum worker is spawned, here is how it chooses what database to process: 1. look for any database needing a vacuum to prevent a wraparound. 2. same with multi-transaction 3. other autovacuum considerations So as long as there's a database in desperate need for a vacuum to prevent a wraparound, a worker will try to process it first, again and again. Because of your long-running transaction, the xid horizon forbid to update the rel/datfrozenxid. So next autovacuum round will keep trying to process the same database, ignoring others. Look at the comment in function "do_stat_worker()" in autovacuum.c for more details: https://git.postgresql.org/cgit/postgresql.git/tree/src/backend/postmaster/autovacuum.c#n1207 When looping over the database list, as soon as "for_xid_wrap" is true, any other database is ignored. Then a new worker is popped from the freeWorkers, init'ed with the database to freeze and started. So as far as I understand the code (I might easily be wrong), all the workers will keep trying to process the same database again and again without considering other ones. All because of your really-long living xact. Regards,
On Fri, Apr 2, 2021 at 10:29 AM Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote: > > On Thu, 18 Mar 2021 09:56:16 +0100 > Luca Ferrari <fluca1978@gmail.com> wrote: > [...] > > Therefore my question is: shouldn't autovacuum be able to freeze other > > tables/databases? I mean, the wraparound problem in this scenario will > > cause problems, but I was expecting different numbers for different > > tables/databases. > > In fact, when an autovacuum worker is spawned, here is how it chooses what > database to process: > > 1. look for any database needing a vacuum to prevent a wraparound. > 2. same with multi-transaction > 3. other autovacuum considerations > > So as long as there's a database in desperate need for a vacuum to prevent a > wraparound, a worker will try to process it first, again and again. > > Because of your long-running transaction, the xid horizon forbid to update the > rel/datfrozenxid. So next autovacuum round will keep trying to process the same > database, ignoring others. > > Look at the comment in function "do_stat_worker()" in autovacuum.c for more > details: > https://git.postgresql.org/cgit/postgresql.git/tree/src/backend/postmaster/autovacuum.c#n1207 > > When looping over the database list, as soon as "for_xid_wrap" is true, any > other database is ignored. Then a new worker is popped from the freeWorkers, > init'ed with the database to freeze and started. So as far as I understand the > code (I might easily be wrong), all the workers will keep trying to process the > same database again and again without considering other ones. All because of > your really-long living xact. This is a damn good and complete explanation that solved, so far, all my doubts. Or, ehm, there is one last: why having a TransactionId that is 32 bits in depth while it is exposed (thru txid_current()) as a 64 bits value? I mean, having 64 bits would reduce the need for anti-wrap arpund vacuum. I suspect the usage of 32 bits is both for compatibility and tuple header size, but I'm just guessing. Thanks, Luca
On Sat, 2021-04-03 at 15:22 +0200, Luca Ferrari wrote: > why having a TransactionId that is 32 bits > in depth while it is exposed (thru txid_current()) as a 64 bits value? > I mean, having 64 bits would reduce the need for anti-wrap arpund > vacuum. I suspect the usage of 32 bits is both for compatibility and > tuple header size, but I'm just guessing. Because there are two of these transaction IDs stored on each tuple (xmin and xmax) to determine its visibility. The overhead of 8 bytes per tuples for visibility is already pretty high. Another downside is that changing this would prevent the use of pg_upgrade for upgrading, as the on-disk format changes. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Tue, 06 Apr 2021 10:46:08 +0200 Laurenz Albe <laurenz.albe@cybertec.at> wrote: > On Sat, 2021-04-03 at 15:22 +0200, Luca Ferrari wrote: > > why having a TransactionId that is 32 bits > > in depth while it is exposed (thru txid_current()) as a 64 bits value? > > I mean, having 64 bits would reduce the need for anti-wrap arpund > > vacuum. I suspect the usage of 32 bits is both for compatibility and > > tuple header size, but I'm just guessing. > > Because there are two of these transaction IDs stored on each tuple > (xmin and xmax) to determine its visibility. The overhead of 8 bytes > per tuples for visibility is already pretty high. > > Another downside is that changing this would prevent the use of > pg_upgrade for upgrading, as the on-disk format changes. Indeed. Compatibility and size. About the txid_current() format. It is showing the 64bit format used internally. It is split in two parts: * lower part is classical XID on 32 bits, stored in tuples header * higher part is the "epoch" of the xid space, ie. the number of time the XID looped. This is not stored in tuples Look for macros EpochFromFullTransactionId and XidFromFullTransactionId in "include/access/transam.h". Given txid_current() returning eg. 10000000000: =# select txid_current(); txid_current -------------- 10000000000 The epoch would be 2: $ echo $((10000000000 >> 32)) 2 The 32bits XID stored in tuples header would be 1410065408: $ echo $((10000000000 % 2**32)) 1410065408 When looking at the NextXID in the controldata file, you would find: $ pg_controldata $PGDATA|grep NextXID Latest checkpoint's NextXID: 2:1410065408 Regards,