Re: recovering from "found xmin ... from before relfrozenxid ..." - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: recovering from "found xmin ... from before relfrozenxid ..." |
Date | |
Msg-id | 849990.1600621996@sss.pgh.pa.us Whole thread Raw |
In response to | Re: recovering from "found xmin ... from before relfrozenxid ..." (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: recovering from "found xmin ... from before relfrozenxid ..."
Re: recovering from "found xmin ... from before relfrozenxid ..." Re: recovering from "found xmin ... from before relfrozenxid ..." Re: recovering from "found xmin ... from before relfrozenxid ..." |
List | pgsql-hackers |
I wrote: > So this confirms the suspicion that the cause of the buildfarm > failures is a concurrently-open transaction, presumably from > autovacuum. I don't have time to poke further right now. I spent some more time analyzing this, and there seem to be two distinct issues: 1. My patch a7212be8b does indeed have a problem. It will allow vacuum_set_xid_limits to compute freezeLimit = nextXid for a temp table if freeze_min_age is zero (ie VACUUM FREEZE). If there's any concurrent transactions, this falls foul of heap_prepare_freeze_tuple's expectation that * NB: cutoff_xid *must* be <= the current global xmin, to ensure that any * XID older than it could neither be running nor seen as running by any * open transaction. This ensures that the replacement will not change * anyone's idea of the tuple state. The "cannot freeze committed xmax" error message appears to be banking on the assumption that we'd not reach heap_prepare_freeze_tuple for any committed-dead tuple unless its xmax is past the specified cutoff_xid. 2. As Amit suspected, there's an inconsistency between pruneheap.c's rules for which tuples are removable and vacuum.c's rules for that. This seems like a massive bug in its own right: what's the point of pruneheap.c going to huge effort to decide whether it should keep a tuple if vacuum will then kill it anyway? I do not understand why whoever put in the GlobalVisState stuff only applied it in pruneheap.c and not VACUUM proper. These two points interact, in that we don't get to the "cannot freeze" failure except when pruneheap has decided not to remove something that is removable according to VACUUM's rules. (VACUUM doesn't actually remove it, because lazy_scan_heap won't try to remove HeapOnly tuples even when it thinks they're HEAPTUPLE_DEAD; but then it tries to freeze the tuple, and heap_prepare_freeze_tuple spits up.) However, if I revert a7212be8b then the pg_surgery test still fails in the presence of a concurrent transaction (both of the expected "skipping TID" notices disappear). So reverting that patch wouldn't get us out of trouble. I think to move forward, we need to figure out what the freezing behavior ought to be for temp tables. We could make it the same as it was before a7212be8b, which'd just require some more complexity in vacuum_set_xid_limits. However, that negates the idea that we'd like VACUUM's behavior on a temp table to be fully independent of whether concurrent transactions exist. I'd prefer to allow a7212be8b's behavior to stand, but then it seems we need to lobotomize the error check in heap_prepare_freeze_tuple to some extent. Independently of that, it seems like we need to fix things so that when pruneheap.c is called by vacuum, it makes EXACTLY the same dead-or-not-dead decisions that the main vacuum code makes. This business with applying some GlobalVisState rule or other instead seems just as unsafe as can be. AFAICS, there is no chance of the existing pg_surgery regression test being fully stable if we don't fix both things. BTW, attached is a quick-hack patch to allow automated testing of this scenario, along the lines I sketched yesterday. This test passes if you run the two scripts serially, but not when you run them in parallel. I'm not proposing this for commit; it's a hack and its timing behavior is probably not stable enough for the buildfarm. But it's pretty useful for poking at these issues. regards, tom lane diff --git a/contrib/pg_surgery/Makefile b/contrib/pg_surgery/Makefile index a66776c4c4..a7bcfca0fb 100644 --- a/contrib/pg_surgery/Makefile +++ b/contrib/pg_surgery/Makefile @@ -9,7 +9,7 @@ EXTENSION = pg_surgery DATA = pg_surgery--1.0.sql PGFILEDESC = "pg_surgery - perform surgery on a damaged relation" -REGRESS = heap_surgery +REGRESS = --schedule=surgery_schedule ifdef USE_PGXS PG_CONFIG = pg_config diff --git a/contrib/pg_surgery/expected/heap_surgery.out b/contrib/pg_surgery/expected/heap_surgery.out index d4a757ffa0..6ef5597e7c 100644 --- a/contrib/pg_surgery/expected/heap_surgery.out +++ b/contrib/pg_surgery/expected/heap_surgery.out @@ -1,4 +1,10 @@ create extension pg_surgery; +select pg_sleep(0.1); -- ensure concurrent transaction is ready + pg_sleep +---------- + +(1 row) + -- create a normal heap table and insert some rows. -- use a temp table so that vacuum behavior doesn't depend on global xmin create temp table htab (a int); diff --git a/contrib/pg_surgery/expected/sleep.out b/contrib/pg_surgery/expected/sleep.out new file mode 100644 index 0000000000..208e31163d --- /dev/null +++ b/contrib/pg_surgery/expected/sleep.out @@ -0,0 +1,6 @@ +select pg_sleep(1.0); -- long enough for whole heap_surgery test + pg_sleep +---------- + +(1 row) + diff --git a/contrib/pg_surgery/sql/heap_surgery.sql b/contrib/pg_surgery/sql/heap_surgery.sql index 6526b27535..212c657f3c 100644 --- a/contrib/pg_surgery/sql/heap_surgery.sql +++ b/contrib/pg_surgery/sql/heap_surgery.sql @@ -1,5 +1,7 @@ create extension pg_surgery; +select pg_sleep(0.1); -- ensure concurrent transaction is ready + -- create a normal heap table and insert some rows. -- use a temp table so that vacuum behavior doesn't depend on global xmin create temp table htab (a int); diff --git a/contrib/pg_surgery/sql/sleep.sql b/contrib/pg_surgery/sql/sleep.sql new file mode 100644 index 0000000000..a85c25bca2 --- /dev/null +++ b/contrib/pg_surgery/sql/sleep.sql @@ -0,0 +1 @@ +select pg_sleep(1.0); -- long enough for whole heap_surgery test diff --git a/contrib/pg_surgery/surgery_schedule b/contrib/pg_surgery/surgery_schedule new file mode 100644 index 0000000000..1bb3aa5e53 --- /dev/null +++ b/contrib/pg_surgery/surgery_schedule @@ -0,0 +1 @@ +test: heap_surgery sleep
pgsql-hackers by date: