On Tue, Jul 6, 2021 at 10:58 AM Bruce Momjian <bruce@momjian.us> wrote:
> Well, pg_upgrade corruptions are rare, but so is modifying
> autovacuum_freeze_max_age. If we have a corruption and we know
> autovacuum_freeze_max_age was modified, odds are that is the cause.
My point is that there isn't necessarily that much use in trying to
determine what really happened here. It would be nice to know for
sure, but it shouldn't affect what we do about the bug.
In a way the situation with the bug is simple. Clearly Tom wasn't
thinking of pg_upgrade when he wrote the relevant pg_resetwal code
that sets oldestXid, because pg_upgrade didn't exist at the time. He
was thinking of restoring the database to a relatively sane state in
the event of some kind of corruption, with all of the uncertainty that
goes with that. Nobody noticed that pg_upgrade gets this same behavior
until much more recently.
Now that we see the problem laid out, there isn't much to think about
that will affect the response to the issue. At least as far as I can
tell. We know that pg_upgrade uses pg_resetwal's -x flag in a context
where there is no reason at all to think that the database is corrupt
-- Tom can't have anticipated that all those years ago. It's easy to
see that the behavior is wrong for pg_upgrade, and it's very hard to
imagine any way in which it might have accidentally made some sense
all along. We should just carry forward the original oldestXid.
--
Peter Geoghegan