On 1/26/20 10:44 PM, Andreas Joseph Krogh wrote:
[..]
I ran uncompressed pg_dump on multiple TB+ sized databases from v8.4 servers across the LAN using 9.6 binaries on the remote server. It was quite fast. Threading was key.
the "directory format" is the only format which supports parallel dumps, if I'm not reading it wrong.
How did threading solve "between database" dump/restore for you? Did you dump to "directory format" first, then restore?
Yes.
If so, then that requires quite a bit of temp-space...
Correct. The databases are mostly compressed TIFF and PDF images in bytea fields, so having Postgres try and compress them again was
slow and used a
lot of CPU. Thus, I did uncompressed backups, and that took a
lot of scratch disk space.
(We were not only upgrading Postgres 8.4 to 9.6, but also RHEL 5.10 to 6.10, and moving to a geographically distant data center. Thus, I deemed pg_upgrade to be impractical.)
We spun up some VMs with 10 total TB in the same DC as the source (physical) servers, and I installed Pg 9.6 on these "intermediate servers", and did remote pg_dumps of the 8.4 servers. Then I installed 9.6 on the VMs in the new DC, and NFS mounted the intermediate servers' volumes and ran multi-threaded pg_restore on the new servers. They pulled the data across the WAN.
--
Angular momentum makes the world go 'round.