Re: Add notes to pg_combinebackup docs - Mailing list pgsql-hackers
From | David Steele |
---|---|
Subject | Re: Add notes to pg_combinebackup docs |
Date | |
Msg-id | 05fb32c9-18d8-4f72-9af3-f41576c33119@pgmasters.net Whole thread Raw |
In response to | Re: Add notes to pg_combinebackup docs (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: Add notes to pg_combinebackup docs
|
List | pgsql-hackers |
On 4/9/24 19:44, Tomas Vondra wrote: > > On 4/9/24 09:59, Martín Marqués wrote: >> Hello, >> >> While doing some work/research on the new incremental backup feature >> some limitations were not listed in the docs. Mainly the fact that >> pg_combienbackup works with plain format and not tar. >> > > Right. The docs mostly imply this by talking about output directory and > backup directories, but making it more explicit would not hurt. > > FWIW it'd be great if we could make incremental backups work with tar > format in the future too. People probably don't want to keep around the > expanded data directory or extract everything before combining the > backups is not very convenient. Reading and writing the tar would make > this simpler. I have a hard time seeing this feature as being very useful, especially for large databases, until pg_combinebackup works on tar (and compressed tar). Right now restoring an incremental requires at least twice the space of the original cluster, which is going to take a lot of users by surprise. I know you have made some improvements here for COW filesystems, but my experience is that Postgres is generally not run on such filesystems, though that is changing a bit. >> Around the same time, Tomas Vondra tested incremental backups with a >> cluster where he enabled checksums after taking the previous full >> backup. After combining the backups the synthetic backup had pages >> with checksums and other pages without checksums which ended in >> checksum errors. > > I'm not sure just documenting this limitation is sufficient. We can't > make the incremental backups work in this case (it's as if someone > messes with cluster without writing stuff into WAL), but I think we > should do better than silently producing (seemingly) corrupted backups. > > I say seemingly, because the backup is actually fine, the only problem > is it has checksums enabled in the controlfile, but the pages from the > full backup (and the early incremental backups) have no checksums. > > What we could do is detect this in pg_combinebackup, and either just > disable checksums with a warning and hint to maybe enable them again. Or > maybe just print that the user needs to disable them. > > I was thinking maybe we could detect this while taking the backups, and > force taking a full backup if checksums got enabled since the last > backup. But we can't do that because we only have the manifest from the > last backup, and the manifest does not include info about checksums. I'd say making a new full backup is the right thing to do in this case. It should be easy enough to store the checksum state of the cluster in the manifest. Regards, -David
pgsql-hackers by date: