Re: trying again to get incremental backup - Mailing list pgsql-hackers

From Robert Haas
Subject Re: trying again to get incremental backup
Date
Msg-id CA+TgmoaGeZ=vWDYvWtPY-sP+6594HqLScvPMSSunyuYZVSWMpw@mail.gmail.com
Whole thread Raw
In response to Re: trying again to get incremental backup  (Jakub Wartak <jakub.wartak@enterprisedb.com>)
Responses Re: trying again to get incremental backup
List pgsql-hackers
On Fri, Dec 15, 2023 at 5:36 AM Jakub Wartak
<jakub.wartak@enterprisedb.com> wrote:
> I've played with with initdb/pg_upgrade (17->17) and i don't get DBID
> mismatch (of course they do differ after initdb), but i get this
> instead:
>
>  $ pg_basebackup -c fast -D /tmp/incr2.after.upgrade -p 5432
> --incremental /tmp/incr1.before.upgrade/backup_manifest
> WARNING:  aborting backup due to backend exiting before pg_backup_stop
> was called
> pg_basebackup: error: could not initiate base backup: ERROR:  timeline
> 2 found in manifest, but not in this server's history
> pg_basebackup: removing data directory "/tmp/incr2.after.upgrade"
>
> Also in the manifest I don't see DBID ?
> Maybe it's a nuisance and all I'm trying to see is that if an
> automated cronjob with pg_basebackup --incremental hits a freshly
> upgraded cluster, that error message without errhint() is going to
> scare some Junior DBAs.

Yeah. I think we should add the system identifier to the manifest, but
I think that should be left for a future project, as I don't think the
lack of it is a good reason to stop all progress here. When we have
that, we can give more reliable error messages about system mismatches
at an earlier stage. Unfortunately, I don't think that the timeline
messages you're seeing here are going to apply in every case: suppose
you have two unrelated servers that are both on timeline 1. I think
you could use a base backup from one of those servers and use it as
the basis for the incremental from the other, and I think that if you
did it right you might fail to hit any sanity check that would block
that. pg_combinebackup will realize there's a problem, because it has
the whole cluster to work with, not just the manifest, and will notice
the mismatching system identifiers, but that's kind of late to find
out that you made a big mistake. However, right now, it's the best we
can do.

> The incrementals are being generated , but just for the first (0)
> segment of the relation?

I committed the first two patches from the series I posted yesterday.
The first should fix this, and the second relocates parse_manifest.c.
That patch hasn't changed in a while and seems unlikely to attract
major objections. There's no real reason to commit it until we're
ready to move forward with the main patches, but I think we're very
close to that now, so I did.

Here's a rebase for cfbot.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment

pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: Change GUC hashtable to use simplehash?
Next
From: "Tristan Partin"
Date:
Subject: Add support for __attribute__((returns_nonnull))