On 3/30/16 4:18 AM, Magnus Hagander wrote:
>
> On Wed, Mar 30, 2016 at 4:10 AM, David Steele <david@pgmasters.net> wrote:
>
> This certainly looks like it would work but it raises the barrier for
> implementing backups by quite a lot. It's fine for backrest or barman
> but it won't be pleasant for anyone who has home-grown scripts.
>
>
> How much does it really raise the bar, though?
>
> It would go from "copy all files and make damn sure you copy pg_control
> last, and rename it to pg_control.backup" to "take this binary blob you
> got from the server and write it to pg_control.backup"?
>
> Also, the target of these APIs is specifically the backup tools and not
> home-written scripts.
Then what would home-grown scripts use, the deprecated API that we know
has issues?
> A simple shell script will have trouble enough using
> it in the first place since it requires a persistent connection to the
> database.
All that's required is to spawn a psql process. I'm no shell expert but
that's simple enough.
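To make that concrete, here is a rough sketch in Python rather than shell. It keeps one client process open across the start call, the file copy, and the stop call. The function name, the marker string, and the SQL passed in are all placeholders of mine, and it assumes the client executes stdin commands in order and flushes its output after each one, as I believe psql does.

```python
import subprocess


def run_in_session(cmd, start_sql, stop_sql, do_copy, marker="READY"):
    """Keep one client process (e.g. psql) open across start, copy, and stop.

    Hypothetical helper: cmd is the client command line, start_sql and
    stop_sql are whatever statements the server version requires.
    """
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line-buffered on our side
    )

    def send(sql):
        # The \echo marker follows the statement, so it is printed only
        # after the statement has finished. Reading up to the marker
        # therefore waits for completion.
        proc.stdin.write(f"{sql}\n\\echo {marker}\n")
        proc.stdin.flush()
        lines = []
        for line in proc.stdout:
            line = line.rstrip("\n")
            if line == marker:
                return lines
            lines.append(line)
        raise RuntimeError("session ended before the statement completed")

    try:
        started = send(start_sql)  # connection stays open from here on
        do_copy()                  # file copy happens while connected
        stopped = send(stop_sql)
    finally:
        proc.stdin.close()
        proc.wait()
    return started, stopped
```

With psql I would expect cmd to be something like ["psql", "-X", "-q", "-At", "-v", "ON_ERROR_STOP=1"], so that a failed statement ends the session and surfaces as the RuntimeError above instead of being silently skipped. The start and stop statements themselves depend on the server version, so I have left them as arguments.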
> But those scripts are likely broken anyway.
Yes, probably. Doing backups, and especially archiving, correctly is
harder than most people realize.
> <...>
>
> The main reason for Heikki to suggest this one over the other basic one
> is that it brings protection against the "backup script/program crashed
> halfway through but the user still tried to restore from that". They will
> outright fail because there is no pg_control.backup in that case.
But if we are going to make this complicated I'm not sure it's a big
deal just to require pg_control to be copied last. pgBackRest already
does that and Barman probably does, too.
I don't see "don't copy pg_control" and "copy pg_control last" as all
that different in terms of complexity.
pgBackRest also *restores* pg_control last which I think is pretty
important and would not be addressed by this solution.
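For comparison, "copy pg_control last" can be sketched in a few lines of Python. This is only an illustration, not how pgBackRest or Barman do it: the function name is made up, and it ignores everything else a real backup has to handle (files to exclude, tablespaces, fsync, and so on). The same ordering would apply when copying files back on restore.

```python
import os
import shutil

# Path of the control file relative to the data directory.
CONTROL_FILE = os.path.join("global", "pg_control")


def copy_cluster(src, dst):
    """Copy a directory tree, deferring global/pg_control to the very end.

    Hypothetical helper. Returns the relative paths in the order copied.
    """
    copied = []
    for root, _dirs, files in os.walk(src):
        rel_root = os.path.relpath(root, src)
        os.makedirs(os.path.normpath(os.path.join(dst, rel_root)), exist_ok=True)
        for name in files:
            rel = os.path.normpath(os.path.join(rel_root, name))
            if rel == CONTROL_FILE:
                continue  # deferred until everything else is in place
            shutil.copy2(os.path.join(src, rel), os.path.join(dst, rel))
            copied.append(rel)

    # If the copy crashed earlier, this line is never reached and the
    # destination has no pg_control at all.
    shutil.copy2(os.path.join(src, CONTROL_FILE), os.path.join(dst, CONTROL_FILE))
    copied.append(CONTROL_FILE)
    return copied
```

The ordering costs one skipped file inside the loop and one extra copy after it, which is why I don't see it as much different in complexity from skipping pg_control entirely.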
> If we
> don't care about that, then we can go back to just saying "copy
> pg_control last and we're done". But you yourself complained about that
> requirement because it's too easy to get wrong (though you advocated
> using backup_label to transfer the data over -- but that has the
> potential for getting more complicated if we now or at any point in the
> future want more than one field to transfer, for example).
Perhaps. I'm still not convinced that getting some information from
backup_label and other information from pg_control is a good solution.
I would rather write the recovery point into the backup_label and use
that instead of the value in pg_control. Text files are much easier to
parse and push around accurately (and test).
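As a rough illustration of how little code a text format needs, here is a Python sketch that reads "KEY: value" lines of the kind found in backup_label. The sample values are made up, and the RECOVERY POINT line is hypothetical: it stands in for the field I am suggesting, not something the server writes today.

```python
# Made-up sample in the style of backup_label. The last line is a
# hypothetical field, shown only to illustrate the proposal.
SAMPLE = """\
START WAL LOCATION: 0/2000028 (file 000000010000000000000002)
CHECKPOINT LOCATION: 0/2000060
BACKUP METHOD: streamed
START TIME: 2016-03-30 04:18:00 UTC
LABEL: nightly
RECOVERY POINT: 0/3000000
"""


def parse_backup_label(text):
    """Split each 'KEY: value' line at the first ': ' into a dict entry."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # ignore lines that do not look like 'KEY: value'
            fields[key] = value
    return fields


def lsn_to_int(lsn):
    """Convert an LSN written as 'hi/lo' in hex into a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


fields = parse_backup_label(SAMPLE)
recovery_point = lsn_to_int(fields["RECOVERY POINT"])
```

Adding another field later means adding another line to the file, and a test for it is a string comparison.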
--
-David
david@pgmasters.net