deduplicating backup of multiple pg_dump dumps - Mailing list pgsql-admin

From Egor Duda
Subject deduplicating backup of multiple pg_dump dumps
Date
Msg-id 5f4d2bdd-e0b6-35e7-501d-e37e5e41a92f@gmail.com
Responses Re: deduplicating backup of multiple pg_dump dumps  (Laurenz Albe <laurenz.albe@cybertec.at>)
List pgsql-admin
Hello!

I recently tried to use borg backup (https://borgbackup.readthedocs.io/) to store multiple
PostgreSQL database dumps and ran into a problem. Because pg_dump output is nondeterministic,
table rows come out in a different order on each invocation, which breaks borg's chunking and
deduplication algorithm.

As a result, each successive dump in the backup almost never reuses data from previous dumps, so
multiple database dumps cannot be stored efficiently.
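
To illustrate the effect, here is a toy model only, not borg's actual chunker: it hashes
fixed-size groups of rows (an assumption made for simplicity) and counts how many chunk hashes
two dumps of the same data share. The row format and chunk size are made up for the example.

```python
import hashlib

def chunk_hashes(lines, rows_per_chunk=10):
    """Hash fixed-size groups of rows; a simplified stand-in for a real chunker."""
    hashes = set()
    for start in range(0, len(lines), rows_per_chunk):
        chunk = "\n".join(lines[start:start + rows_per_chunk])
        hashes.add(hashlib.sha256(chunk.encode()).hexdigest())
    return hashes

# Hypothetical table contents: 1000 tab-separated rows.
rows = [f"{i}\tvalue-{i}" for i in range(1000)]

# Chunks shared between two dumps with identical row order.
same_order = chunk_hashes(rows) & chunk_hashes(list(rows))
# Chunks shared when the second dump has the same rows in reverse order.
reordered = chunk_hashes(rows) & chunk_hashes(rows[::-1])

print(len(same_order))  # 100
print(len(reordered))   # 0
```

The data is identical in both cases, yet once the row order changes no chunk can be reused.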

Is there any way to force pg_dump to use a predictable ordering of data rows (for example, by
primary key, where possible) so that dumps are more uniform, similar to the --order-by-primary
option of mysqldump?
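
For comparison, the kind of per-table workaround I can imagine outside pg_dump is a COPY of an
ordered query. This is only a sketch: the helper names, the schema and table, and the key column
are hypothetical, and the key columns would have to be supplied by hand or looked up separately.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'

def ordered_copy_sql(schema, table, key_columns):
    """Build a COPY statement that emits the table's rows sorted by key_columns."""
    target = f"{quote_ident(schema)}.{quote_ident(table)}"
    order_by = ", ".join(quote_ident(col) for col in key_columns)
    return f"COPY (SELECT * FROM {target} ORDER BY {order_by}) TO STDOUT"

# Hypothetical table "public.orders" with primary key "id".
print(ordered_copy_sql("public", "orders", ["id"]))
# COPY (SELECT * FROM "public"."orders" ORDER BY "id") TO STDOUT
```

The generated statement could be run with psql -c and its output stored per table, but that covers
table data only, so the schema would still need a separate pg_dump --schema-only.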

Egor.
