Re: Add notes to pg_combinebackup docs - Mailing list pgsql-hackers

From David Steele
Subject Re: Add notes to pg_combinebackup docs
Date
Msg-id 05fb32c9-18d8-4f72-9af3-f41576c33119@pgmasters.net
In response to Re: Add notes to pg_combinebackup docs  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Add notes to pg_combinebackup docs
List pgsql-hackers
On 4/9/24 19:44, Tomas Vondra wrote:
> 
> On 4/9/24 09:59, Martín Marqués wrote:
>> Hello,
>>
>> While doing some work/research on the new incremental backup feature,
>> I noticed that some limitations are not listed in the docs. Mainly the
>> fact that pg_combinebackup works with plain format and not tar.
>>
> 
> Right. The docs mostly imply this by talking about output directory and
> backup directories, but making it more explicit would not hurt.
> 
> FWIW it'd be great if we could make incremental backups work with tar
> format in the future too. People probably don't want to keep the
> expanded data directory around, and extracting everything before
> combining the backups is not very convenient. Reading and writing the
> tar directly would make this simpler.

I have a hard time seeing this feature as being very useful, especially 
for large databases, until pg_combinebackup works on tar (and compressed 
tar). Right now restoring an incremental requires at least twice the 
space of the original cluster, which is going to take a lot of users by 
surprise.
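
To make the "at least twice the space" point concrete, here is a rough back-of-the-envelope sketch. It assumes plain-format backups that all have to sit extracted on disk, no COW cloning, and a combined output roughly the size of the cluster. The function name and the example sizes are made up for illustration only.

```python
def peak_restore_bytes(full_bytes, incremental_bytes, cluster_bytes=None):
    """Rough peak disk usage while combining plain-format backups.

    Assumption: every input backup is present on disk in expanded form,
    and the synthesized output is about as large as the cluster itself.
    """
    if cluster_bytes is None:
        # Without better information, assume the cluster is about the
        # size of the full backup.
        cluster_bytes = full_bytes
    inputs = full_bytes + sum(incremental_bytes)
    return inputs + cluster_bytes


GIB = 1024 ** 3
# Hypothetical sizes: a 500 GiB full backup plus two incrementals.
estimate = peak_restore_bytes(500 * GIB, [20 * GIB, 35 * GIB])
print(f"{estimate / GIB:.0f} GiB needed at peak")
```

Even with no incrementals at all, the estimate is double the size of the full backup, since the input and the output both exist at the same time.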

I know you have made some improvements here for COW filesystems, but my 
experience is that Postgres is generally not run on such filesystems, 
though that is changing a bit.

>> Around the same time, Tomas Vondra tested incremental backups with a
>> cluster where he enabled checksums after taking the previous full
>> backup. After combining the backups the synthetic backup had pages
>> with checksums and other pages without checksums which ended in
>> checksum errors.
> 
> I'm not sure just documenting this limitation is sufficient. We can't
> make the incremental backups work in this case (it's as if someone
> messes with the cluster without writing stuff into WAL), but I think we
> should do better than silently producing (seemingly) corrupted backups.
> 
> I say seemingly, because the backup is actually fine, the only problem
> is it has checksums enabled in the controlfile, but the pages from the
> full backup (and the early incremental backups) have no checksums.
> 
> What we could do is detect this in pg_combinebackup, and either
> disable checksums with a warning and a hint to maybe enable them again,
> or just tell the user that they need to disable them.
> 
> I was thinking maybe we could detect this while taking the backups, and
> force taking a full backup if checksums got enabled since the last
> backup. But we can't do that because we only have the manifest from the
> last backup, and the manifest does not include info about checksums.

I'd say making a new full backup is the right thing to do in this case. 
It should be easy enough to store the checksum state of the cluster in 
the manifest.
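
A minimal sketch of what that check could look like, assuming the manifest (which is JSON) gained a top-level field recording the checksum state. The "Data-Checksums" key is hypothetical and does not exist in the current manifest format, and the function name is invented for this example.

```python
import json

# Hypothetical manifest field; NOT part of the real backup manifest format.
CHECKSUM_KEY = "Data-Checksums"


def needs_full_backup(prior_manifest_text, checksums_enabled_now):
    """Decide whether an incremental backup on top of the prior one is unsafe.

    Returns True when the checksum state changed since the prior backup,
    or when the prior manifest does not record the state at all (the
    conservative choice for this sketch).
    """
    manifest = json.loads(prior_manifest_text)
    recorded = manifest.get(CHECKSUM_KEY)
    if recorded is None:
        return True
    return bool(recorded) != bool(checksums_enabled_now)


# Prior backup taken without checksums, cluster now has them enabled.
prior = json.dumps({"PostgreSQL-Backup-Manifest-Version": 2,
                    CHECKSUM_KEY: False})
print(needs_full_backup(prior, checksums_enabled_now=True))
```

Treating a missing field as "take a full backup" is only one possible policy; a real implementation might prefer to warn instead when the prior manifest predates the field.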

Regards,
-David


