Re: Updated backup APIs for non-exclusive backups - Mailing list pgsql-hackers

From David Steele
Subject Re: Updated backup APIs for non-exclusive backups
Msg-id 56FC84BF.1050401@pgmasters.net
In response to Re: Updated backup APIs for non-exclusive backups  (Magnus Hagander <magnus@hagander.net>)
Responses Re: Updated backup APIs for non-exclusive backups
List pgsql-hackers
On 3/30/16 4:18 AM, Magnus Hagander wrote:
> 
> On Wed, Mar 30, 2016 at 4:10 AM, David Steele <david@pgmasters.net> wrote:
> 
>     This certainly looks like it would work but it raises the barrier for
>     implementing backups by quite a lot.  It's fine for backrest or barman
>     but it won't be pleasant for anyone who has home-grown scripts.
> 
> 
> How much does it really raise the bar, though?
> 
> It would go from "copy all files and make damn sure you copy pg_control
> last, and rename it to pg_control.backup" to "take this binary blob you
> got from the server and write it to pg_control.backup"?
> 
> Also, the target of these APIs is specifically the backup tools and not
> homewritten scripts.

Then what would home-grown scripts use, the deprecated API that we know
has issues?

> A simple shellscript will have trouble enough using
> it in the first place since it requires a persistent connection to the
> database. 

All that's required is to spawn a psql process.  I'm no shell expert but
that's simple enough.
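
A minimal sketch of that idea, in Python rather than shell. It assumes the
non-exclusive signatures pg_start_backup(label, fast, exclusive) and
pg_stop_backup(exclusive) from the patch under discussion, and that psql
flushes each result before reading the next command. run_backup and
copy_files are hypothetical names, and error handling is left out.

```python
import subprocess


def run_backup(copy_files, label="nightly", psql_cmd=("psql", "-XAtq")):
    """Hold one psql session open for the whole backup (hypothetical helper).

    The label is assumed to contain no single quotes.
    """
    proc = subprocess.Popen(
        list(psql_cmd),
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )

    # Start a non-exclusive backup; the start LSN comes back on one line.
    proc.stdin.write("SELECT pg_start_backup('%s', false, false);\n" % label)
    proc.stdin.flush()
    start_lsn = proc.stdout.readline().strip()

    # The same session has to stay connected while the files are copied.
    copy_files()

    # Stop the backup on the same connection, then let psql exit at EOF.
    proc.stdin.write("SELECT * FROM pg_stop_backup(false);\n")
    proc.stdin.close()
    stop_output = proc.stdout.read()
    proc.wait()
    return start_lsn, stop_output
```

The caller passes its own copy routine, e.g. run_backup(lambda: my_copy()),
where my_copy stands in for whatever the script already does (rsync, tar,
and so on).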

> But those scripts are likely broken anyway.

Yes, probably.  Doing backup, and especially archiving, correctly is
harder than most people realize.

> <...>
> 
> The main reason for Heikki to suggest this one over the other basic one
> is that it brings protection against the "backup script/program crashed
> halfway through but the user still tried to restore from that". They will
> outright fail because there is no pg_control.backup in that case.

But if we are going to make this complicated I'm not sure it's a big
deal just to require pg_control to be copied last.  pgBackRest already
does that and Barman probably does, too.

I don't see "don't copy pg_control" and "copy pg_control last" as all
that different in terms of complexity.

pgBackRest also *restores* pg_control last, which I think is pretty
important and would not be addressed by this solution.
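
The "pg_control last" rule can be sketched as a plain ordering step that
applies equally to backup and restore. This is an illustration only, not
how pgBackRest implements it; control_file_last is a hypothetical name and
paths are assumed to be relative to the data directory.

```python
from pathlib import PurePosixPath

# Location of the control file relative to the data directory.
PG_CONTROL = PurePosixPath("global/pg_control")


def control_file_last(relative_paths):
    """Return the paths in their original order, with pg_control moved last."""
    paths = [PurePosixPath(p) for p in relative_paths]
    others = [p for p in paths if p != PG_CONTROL]
    control = [p for p in paths if p == PG_CONTROL]
    return others + control
```

Copying in this order during restore means an interrupted restore leaves a
data directory with no pg_control, which will not start, rather than one
that looks complete.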

> If we
> don't care about that, then we can go back to just saying "copy
> pg_control last and we're done". But you yourself complained about that
> requirement because it's too easy to get wrong (though you advocated
> using backup_label to transfer the data over -- but that has the
> potential for getting more complicated if we now or at any point in the
> future want more than one field to transfer, for example).

Perhaps.  I'm still not convinced that getting some information from
backup_label and other information from pg_control is a good solution.
I would rather write the recovery point into the backup_label and use
that instead of the value in pg_control.  Text files are much easier to
parse and push around accurately (and test).
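
To illustrate how little it takes to read a text format like this, here is
a minimal sketch of a backup_label parser. The "MIN RECOVERY POINT" field is
hypothetical (it stands in for the recovery point proposed above and is not
an existing backup_label field), and the sample values are made up.

```python
def parse_backup_label(text):
    """Split 'KEY: value' lines into a dict; lines without ': ' are skipped."""
    fields = {}
    for line in text.splitlines():
        # Split on the first ': ' only, so values may contain colons.
        key, sep, value = line.partition(": ")
        if sep:
            fields[key] = value
    return fields


# Made-up sample; the last field is the hypothetical addition.
SAMPLE_LABEL = """\
START WAL LOCATION: 0/2000028 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/2000060
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2016-03-30 04:18:00 UTC
LABEL: nightly
MIN RECOVERY POINT: 0/2000130
"""
```

Adding a further field later would not change the parser, which is the
main appeal over a binary blob.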

-- 
-David
david@pgmasters.net


