Re: Updated backup APIs for non-exclusive backups - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: Updated backup APIs for non-exclusive backups
Date
Msg-id CABUevExLtXvCceo7MrPJTyPbq8f+2fVhsZsY_8L7-6OMqzi0DQ@mail.gmail.com
In response to Re: Updated backup APIs for non-exclusive backups  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers


On Mon, Apr 25, 2016 at 4:11 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
On Thu, Apr 21, 2016 at 2:32 AM, Magnus Hagander <magnus@hagander.net> wrote:
> On Wed, Apr 20, 2016 at 1:12 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>
>> On Sun, Apr 17, 2016 at 1:22 AM, Magnus Hagander <magnus@hagander.net>
>> wrote:
>> > On Wed, Apr 13, 2016 at 4:07 AM, Noah Misch <noah@leadboat.com> wrote:
>> >>
>> >> On Tue, Apr 12, 2016 at 10:08:23PM +0200, Magnus Hagander wrote:
>> >> > On Tue, Apr 12, 2016 at 8:39 AM, Noah Misch <noah@leadboat.com>
>> >> > wrote:
>> >> > > On Mon, Apr 11, 2016 at 11:22:27AM +0200, Magnus Hagander wrote:
> Yes, it's a bit annoying. But it's something you can handle. Unlike the
> problems that exist with an exclusive backup which you *cannot* handle from
> the backup script/commands.

I know of many systems that are currently running well in production and
handle that problem with exclusive backups. For example, they use
Pacemaker/Corosync for a shared-disk HA configuration. The PostgreSQL resource
agent for Pacemaker removes backup_label on failover to
prevent the standby server from failing to recover from the crash.
This is not a very neat solution, but they have been able to live with the
problem for a long time so far.

I'm NOT against migrating their backup scripts to the new method
eventually. On the other hand, I don't think we need to warn them and rush
them into doing that, at least until the new method has been polished.

We have not put a timeframe on the deprecation. Given that, they can expect to have multiple releases in which to migrate. And surely they don't use psql in the first place, but an interface that gives them more control already, don't they?

 
>> Another pain point of a non-exclusive backup is having to execute both
>> pg_start_backup() and pg_stop_backup() on the same connection.
>
>
> Pretty sure that's the same one you just listed?

Yes, the sources of those pains are the same.

I wonder if it would be better to store the backup state information in shared
memory, a catalog, a flat file or something else instead of local memory,
so that we can execute pg_stop_backup() in a different session from the one
in which pg_start_backup() was called. Maybe I'm missing something basic, but
why do those functions need to be called in the same session at all?

So that we can automatically terminate backup mode if the client disappears.


> Yes, if you insist on using psql - which is not a good tool for this - then
> yes. But you could for example make it a psql script that does something
> along the lines of:
> SELECT pg_start_backup();
> \! rsync ...
> SELECT pg_stop_backup()
>
> Or as said above, drive it through a pipe instead. Or use an appropriate
> client library.

So, what's the appropriate client library for the new backup method?
It would be better to document that.


Take your pick. *Any* client library will work fine. libpq, psycopg2, JDBC, DBD::Pg, npgsql, doesn't matter.

psql is not a client *library*, it's a client.
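
To make the "any client library" point concrete, here is a rough sketch in Python of driving both calls over one connection. It is only an illustration under stated assumptions: `run_backup` and `copy_data_dir` are hypothetical names, the connection is any DB-API 2.0 connection (psycopg2 would be one choice), the `%s` placeholder follows psycopg2's parameter style, and the function signatures are the non-exclusive ones from this patch (`pg_start_backup(label, fast, exclusive)` and `pg_stop_backup(exclusive)`).

```python
from contextlib import closing


def run_backup(conn, label, copy_data_dir):
    """Run a non-exclusive base backup over ONE connection.

    conn          -- any DB-API 2.0 connection (assumption: psycopg2-style
                     "%s" placeholders)
    copy_data_dir -- hypothetical callable that copies the data directory
                     (rsync, tar, a storage snapshot, ...)

    Returns the (lsn, labelfile, spcmapfile) row from pg_stop_backup().
    """
    with closing(conn.cursor()) as cur:
        # Third argument false selects the non-exclusive mode.
        cur.execute("SELECT pg_start_backup(%s, false, false)", (label,))
        cur.fetchone()

        # The copy runs while the session that started the backup is still
        # connected. If this raises, the caller closes the connection and
        # the server leaves backup mode on its own.
        copy_data_dir()

        # Must be issued on the same connection as pg_start_backup().
        cur.execute(
            "SELECT lsn, labelfile, spcmapfile FROM pg_stop_backup(false)"
        )
        return cur.fetchone()
```

A caller would open the connection, pass it in together with whatever copy step it uses, and keep the returned row to write out the label and tablespace map.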



>> How should we write those two values to different files when
>> we execute pg_stop_backup() via psql? Should the whole output of
>> pg_stop_backup() be written to a transient file and then
>> filtered and written to two different files by
>> using some Linux commands? This also seems to make the backup
>> script more complicated.
>
>
> Yes, you would have to do that. And yes, psql is not a good tool for this.
> So the solution is to not force yourself to use a tool that doesn't work
> well for this problem.

So it would be better to document a tool that works well, or how to use psql
for the new backup method (i.e., how to create the backup_label and
tablespace_map files from the output of psql; of course it would also be
better to include an example).

We didn't have examples before, did we?

These APIs are directed at *tool developers*. Surely they can be trusted to be able to write a file.
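
For what it's worth, the file-writing part is only a few lines in any language. A minimal sketch in Python, assuming the label and tablespace map contents have already been fetched as strings from the second and third columns of pg_stop_backup(); `write_backup_files` and its parameter names are made up for illustration:

```python
import os


def write_backup_files(backup_dir, labelfile, spcmapfile):
    """Write pg_stop_backup()'s text columns into the root of the backup.

    labelfile  -- contents for backup_label (always written)
    spcmapfile -- contents for tablespace_map (written only if non-empty)

    Returns the list of paths that were written.
    """
    written = []

    # newline="" disables newline translation, so the contents are stored
    # exactly as the server returned them.
    label_path = os.path.join(backup_dir, "backup_label")
    with open(label_path, "w", newline="") as f:
        f.write(labelfile)
    written.append(label_path)

    # An empty tablespace map means there is nothing to record, so no
    # file is created in that case.
    if spcmapfile:
        map_path = os.path.join(backup_dir, "tablespace_map")
        with open(map_path, "w", newline="") as f:
            f.write(spcmapfile)
        written.append(map_path)

    return written
```

The point is just that the contents go into the backup unmodified; how the tool gets them there is up to the tool.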

*End users* should not be using those APIs at all. They should be using pg_basebackup, or one of the other tools. That's the "bigger documentation rewrite" that was referenced in the thread, but nobody could commit to getting done before beta.
