Currently the fact that it needs to go back to old tables and FTS them every 2B transactions (or rely on autovacuum for this) and you can't do anything about it (like permanently freeze the tables) seems like a big scalability issue. Does it not?
Unfortunately it's not terribly easy to fix this. The problem is if we try to play games here, we must have a 100% reliable method for changing relfrozenxid as soon as someone inserts a new tuple in the relation. It might be possible to tie this into the visibility map, but no one has looked at this yet.
Perhaps you'd be willing to investigate this, or sponsor the work?
I'll see what I can do. Will talk to folks at pgDay in a month.
Oh, there is another possibility that's been discussed: read-only tables. If we had the ability to mark a table read-only, then a VACUUM FREEZE on such a table would be able to set that table's relfrozenxid to FrozenTransactionId and prevent any further attempts at vacuuming. This might be easier than trying to do something automatic.
I think if we could log "last update/delete/insert" timestamp for a table - we could use that to freeze tables that are not changed. I also wonder how pg_database.datfrozenxid is set? Is it equal to the oldest pg_class.relfrozenxid for that database?
I ask because I am willing to give a try and update relfrozenxid for the tables that are never updated and frozen. Currently we are looking at 8-hour downtime to vacuum the whole db in single-user mode. High availability is more important that data loss in my case. [I still don't want to lose data, but it won't be the end of world if it happens].
Having read-only tables would be great.
I was able to get great performance from unlogged tables, similarly read-only tables would be able to address issue with high-transactions and many large stale tables.
-- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com