Re: Updated backup APIs for non-exclusive backups - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Updated backup APIs for non-exclusive backups
Date
Msg-id CAHGQGwHK8xstzftpW0kAy5LRY=MmTvCHzxRzzUf3-iERNSAz9Q@mail.gmail.com
In response to Re: Updated backup APIs for non-exclusive backups  (Magnus Hagander <magnus@hagander.net>)
Responses Re: Updated backup APIs for non-exclusive backups  (Magnus Hagander <magnus@hagander.net>)
Re: Updated backup APIs for non-exclusive backups  (Kevin Grittner <kgrittn@gmail.com>)
List pgsql-hackers
On Thu, Apr 21, 2016 at 2:32 AM, Magnus Hagander <magnus@hagander.net> wrote:
> On Wed, Apr 20, 2016 at 1:12 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>
>> On Sun, Apr 17, 2016 at 1:22 AM, Magnus Hagander <magnus@hagander.net>
>> wrote:
>> > On Wed, Apr 13, 2016 at 4:07 AM, Noah Misch <noah@leadboat.com> wrote:
>> >>
>> >> On Tue, Apr 12, 2016 at 10:08:23PM +0200, Magnus Hagander wrote:
>> >> > On Tue, Apr 12, 2016 at 8:39 AM, Noah Misch <noah@leadboat.com>
>> >> > wrote:
>> >> > > On Mon, Apr 11, 2016 at 11:22:27AM +0200, Magnus Hagander wrote:
>> >> > > > Well, if we *don't* do the rewrite before we release it, then we
>> >> > > > have to
>> >> > > > instead put information about the new version of the functions
>> >> > > > into
>> >> > > > the
>> >> > > old
>> >> > > > structure I think.
>> >> > > >
>> >> > > > So I think it's an open issue.
>> >> > >
>> >> > > Works for me...
>> >> > >
>> >> > > [This is a generic notification.]
>> >> > >
>> >> > > The above-described topic is currently a PostgreSQL 9.6 open item.
>> >> > > Magnus,
>> >> > > since you committed the patch believed to have created it, you own
>> >> > > this
>> >> > > open
>> >> > > item.  If that responsibility lies elsewhere, please let us know
>> >> > > whose
>> >> > > responsibility it is to fix this.  Since new open items may be
>> >> > > discovered
>> >> > > at
>> >> > > any time and I want to plan to have them all fixed well in advance
>> >> > > of
>> >> > > the
>> >> > > ship
>> >> > > date, I will appreciate your efforts toward speedy resolution.
>> >> > > Please
>> >> > > present, within 72 hours, a plan to fix the defect within seven
>> >> > > days
>> >> > > of
>> >> > > this
>> >> > > message.  Thanks.
>> >> > >
>> >> >
>> >> > I won't have time to do the bigger rewrite/reordering by then, but I
>> >> > can
>> >> > certainly commit to having the smaller updates done to cover the new
>> >> > functionality in less than a week. If nothing else, that'll be
>> >> > something
>> >> > for me to do on the flight over to pgconf.us.
>> >>
>> >> Thanks for that plan; it sounds good.
>> >
>> >
>> > Here's a suggested patch.
>> >
>> > There is some duplication between the non-exclusive and exclusive backup
>> > sections, but I wanted to make sure that each set of instructions can
>> > just
>> > be followed top-to-bottom.
>> >
>> > I've also removed some tips that aren't really necessary as part of the
>> > step-by-step instructions in order to keep things from exploding in
>> > size.
>> >
>> > Finally, I've changed references to "backup dump" to just be "backup",
>> > because it's confusing to call them something with dumps in when it's
>> > not
>> > pg_dump. Enough that I got partially confused myself while editing...
>> >
>> > Comments?
>>
>> +    Low level base backups can be made in a non-exclusive or an exclusive
>> +    way. The non-exclusive method is recommended and the exclusive one
>> will
>> +    at some point be deprecated and removed.
>>
>> I don't object to adding a non-exclusive mode of low level backup,
>> but I disagree with marking an exclusive backup as deprecated, at least
>> until we can alleviate some pains that a non-exclusive mode causes.
>
>
> Note that we have not marked them as deprecated. We're just giving warnings
> that they will be deprecated.
>
>>
>>
>> One example of the pain: in a non-exclusive backup, we need to keep
>> the idle connection which was used to execute pg_start_backup()
>> open until the end of the backup. Of course a backup can take a very
>> long time, and in that case the idle connection also needs to remain
>> open for that long. If it's accidentally terminated (e.g., because
>> the connection is idle), the backup fails and needs to be taken again
>> from the beginning.
>
>
>
> Yes, it's a bit annoying. But it's something you can handle. Unlike the
> problems that exist with an exclusive backup which you *cannot* handle from
> the backup script/commands.

I know many systems which are currently running well in production and
handling that problem of an exclusive backup. For example, they use
Pacemaker/Corosync for a shared-disk HA configuration. The PostgreSQL
resource agent for Pacemaker removes backup_label in the case of failover
to prevent the standby server from failing to recover from the crash.
This is not a very neat solution, but they have been able to live with
the problem for a long time so far.

I'm NOT against eventually migrating their backup scripts to the new
method. On the other hand, I don't think we need to warn them and rush
them into doing that, at least until the new method has been polished.

>> Another pain in a non-exclusive backup is having to execute both
>> pg_start_backup() and pg_stop_backup() on the same connection.
>
>
> Pretty sure that's the same one you just listed?

Yes, the sources of those pains are the same.

I wonder if it would be better to store the backup state information in
shared memory, a catalog, a flat file or something else instead of local
memory, so that we can execute pg_stop_backup() in a different session
from the one in which pg_start_backup() was called. Maybe I'm missing
something basic, but why do those functions need to be called in the
same session at all?

>> Please imagine the case where psql is used to execute those two
>> backup functions (I believe that there are many users who do this).
>> For example,
>>
>>     psql -c "SELECT pg_start_backup()"
>>     rsync, cp, tar, storage backup, or something
>>     psql -c "SELECT pg_stop_backup()"
>>
>> A non-exclusive backup breaks the above very simple steps because
>> two backup functions are executed on different connections.
>>
>> So, how should we modify the steps for a non-exclusive backup?
>
>
> You should at least not use psql in that way. You need to open psql in a
> pipe and drive it through that. Or use a more appropriate client.
>
>
>> Basically we need to pause psql after pg_start_backup(), signal it
>> to resume after the copy of database cluster is taken, and make
>> it execute pg_stop_backup(). I'm afraid that the backup script
>> will be complicated because of this pain of non-exclusive backup.
>
>
> Yes, if you insist on using psql - which is not a good tool for this - then
> yes. But you could for example make it a psql script that does something
> along the lines of:
> SELECT pg_start_backup();
> \! rsync ...
> SELECT pg_stop_backup()
>
> Or as said above, drive it through a pipe instead. Or use an appropriate
> client library.

So, what's the appropriate client library for the new backup method?
It would be better to document that.
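
As one illustration of what such documentation could show, here is a
minimal sketch in Python. It is only a sketch under assumptions: it
assumes a DB-API 2.0 driver such as psycopg2 (my choice for the example,
nobody in this thread proposed it), the %s parameter style that driver
uses, and the pg_start_backup(label, fast, exclusive) and
pg_stop_backup(exclusive) signatures from the committed patch. The
function name run_nonexclusive_backup and the copy command are
hypothetical.

```python
import subprocess
from contextlib import closing


def run_nonexclusive_backup(conn, label, copy_command):
    """Run start, copy and stop on ONE connection.

    conn is an already-open DB-API connection (assumed: psycopg2).
    Returns the pg_stop_backup row: (lsn, labelfile, spcmapfile).
    """
    with closing(conn.cursor()) as cur:
        # Third argument false = non-exclusive (assumed 9.6 signature).
        cur.execute("SELECT pg_start_backup(%s, false, false)", (label,))
        cur.fetchone()
        # The session stays open, and idle, while the files are copied.
        # check=True raises if the copy command exits non-zero.
        subprocess.run(copy_command, check=True)
        # Must run on the same connection that started the backup.
        cur.execute("SELECT * FROM pg_stop_backup(false)")
        return cur.fetchone()
```

A caller would open the connection once, pass something like
["rsync", "-a", ...] as copy_command, and close the connection only
after the function returns. If the copy fails, the exception propagates
and closing the connection aborts the backup on the server side.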

>
> The bottom line is yes, it makes it a bit worse. But it is something you can
> handle quite simply from your client. And when you do, it will be *right*.
>
> You have no way at all to fix the problem if your server crashes during an
> exclusive backup, for example. No matter how you connect to the database,
> because that's not where that problem is.
>
>
>>
>>
>> +     The <function>pg_stop_backup</> will return one row with three
>> +     values. The second of these fields should be written to a file named
>> +     <filename>backup_label</> in the root directory of the backup. The
>> +     third field should be written to a file named
>> +     <filename>tablespace_map</> unless the field is empty.
>>
>> How should we write those two values to different files when
>> we execute pg_stop_backup() via psql? Should the whole output of
>> pg_stop_backup() be written to a transient file, then filtered
>> and written to two different files by using some Linux commands?
>> This also seems to make the backup script more complicated.
>
>
> Yes, you would have to do that. And yes, psql is not a good tool for this.
> So the solution is to not force yourself to use a tool that doesn't work
> well for this problem.

So it would be better to document either a tool that works well, or how
to use psql with the new backup method (i.e., how to create the
backup_label and tablespace_map files from the output of psql; of course,
it would be even better to include an example).
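
To make the file-writing step concrete, here is a minimal Python sketch
of the rule in the quoted doc text: the second field goes to
backup_label in the root directory of the backup, and the third goes to
tablespace_map unless it is empty. The function name write_backup_files
is hypothetical, and it assumes the caller already has the single row
returned by pg_stop_backup() as a three-element tuple.

```python
import os


def write_backup_files(stop_row, backup_root):
    """Write backup_label, and tablespace_map if non-empty.

    stop_row is the (lsn, labelfile, spcmapfile) row from pg_stop_backup.
    Returns the list of file names that were written.
    """
    _lsn, labelfile, spcmapfile = stop_row
    written = []
    # newline="" writes the server's text without newline translation.
    path = os.path.join(backup_root, "backup_label")
    with open(path, "w", newline="") as f:
        f.write(labelfile)
    written.append("backup_label")
    if spcmapfile:  # empty or NULL: no tablespace map to write
        path = os.path.join(backup_root, "tablespace_map")
        with open(path, "w", newline="") as f:
            f.write(spcmapfile)
        written.append("tablespace_map")
    return written
```

With a client library this is a few lines; with plain psql the same
split has to be done by post-processing the output, which is the
complication I am worried about.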

Regards,

-- 
Fujii Masao


