On Tue, Dec 5, 2017 at 11:01 PM, Bruce Momjian <bruce@momjian.us> wrote:
As part of PGConf.Asia 2017 in Tokyo, we had an unconference topic about zero-downtime upgrades. After the usual discussion of using logical replication, Slony, and perhaps having the server be able to read old and new system catalogs, we discussed speeding up pg_upgrade.
There are clusters that take a long time to dump the schema from the old cluster and recreate it in the new cluster. One idea for speeding up pg_upgrade would be to allow pg_upgrade to be run in two stages:
1. prevent system catalog changes while the old cluster is running, and dump the old cluster's schema and restore it in the new cluster
2. shut down the old cluster and copy/link the data files
When we were discussing this, I was thinking that the linking could be done in stage 1 as well, since creating the links is potentially slow when the schema contains a very large number of relations.
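To make the proposed split concrete, here is a rough sketch of the two stages in Python, using standard client tools rather than any existing pg_upgrade option. The directory paths, ports, and the assumption that catalog changes are frozen during stage 1 are hypothetical, and the hard-link loop glosses over the relfilenode mapping that pg_upgrade actually performs:

#!/usr/bin/env python3
# Conceptual sketch of a two-stage pg_upgrade, as discussed above.
# NOT an existing pg_upgrade mode; paths, ports, and the catalog-freeze
# assumption are illustrative only.
import os
import subprocess

OLD_DATADIR = "/srv/pg/9.6/data"    # hypothetical old cluster
NEW_DATADIR = "/srv/pg/10/data"     # hypothetical new cluster
OLD_PORT, NEW_PORT = "5432", "5433"

def run(cmd):
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# --- Stage 1: old cluster still running (catalog changes assumed frozen) ---
# Dump only the schema from the old cluster and restore it into the new one.
run(["pg_dumpall", "--schema-only", "--port", OLD_PORT, "--file", "schema.sql"])
run(["psql", "--port", NEW_PORT, "--dbname", "postgres", "--file", "schema.sql"])

# --- Stage 2: shut down the old cluster, then copy/link the data files ---
run(["pg_ctl", "stop", "-D", OLD_DATADIR, "-m", "fast"])
# Hard-link each relation file into the new cluster, as "pg_upgrade --link"
# does; the per-file relfilenode mapping is omitted here for brevity.
for root, _dirs, files in os.walk(os.path.join(OLD_DATADIR, "base")):
    for f in files:
        src = os.path.join(root, f)
        dst = src.replace(OLD_DATADIR, NEW_DATADIR)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        if not os.path.exists(dst):
            os.link(src, dst)   # same-filesystem hard link, no data copied

The point of the split is that only stage 2 requires the old cluster to be down; stage 1 can run arbitrarily early, which is where the schema dump/restore time would be hidden.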
My question is whether the schema dump/restore is time-consuming enough to warrant this optional, more complex API, and whether people would support adding a server setting that prevents all system catalog changes.
I've certainly heard of cases where pg_upgrade takes significant amounts of time to run on very complex databases.