Re: Remove Deprecated Exclusive Backup Mode - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: Remove Deprecated Exclusive Backup Mode
Date
Msg-id 20200701172321.GS3125@tamriel.snowman.net
In response to Re: Remove Deprecated Exclusive Backup Mode  (Magnus Hagander <magnus@hagander.net>)
Responses Re: Remove Deprecated Exclusive Backup Mode
List pgsql-hackers
Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:
> On Wed, Jul 1, 2020 at 6:29 PM Stephen Frost <sfrost@snowman.net> wrote:
> > * Magnus Hagander (magnus@hagander.net) wrote:
> > > On Wed, Jul 1, 2020 at 2:47 PM Stephen Frost <sfrost@snowman.net> wrote:
> > > > This would presumably be for the exclusive API as a way to make it not
> > > > completely broken, maybe.
> > > >
> > > > If we wanted to try and make this work in a non-exclusive manner then
> > > > we'd need to do something like have the user save some info out at
> > > > pg_start_backup time that they then provide at pg_stop_backup time, so
> > > > we can match up the specific backup and write the appropriate WAL
> > > > message.
> > >
> > > We don't even need to make it an exclusive mode -- we can allow
> > > *nonexclusive* backups but remove the requirement to run start and stop
> > > backup in the same connection, which I believe is the problem that people
> > > have with the exclusive mode. So how about something like this:
> > >
> > > 1. Make pg_start_backup() return the backup label file as well. We can add
> > > an extra column to the output without breaking backwards compatibility. And
> > > since we have all the information for the file at pg_start_backup() time,
> > > the user can then write that into the backup. We clearly document that this
> > > should *not* be written as "backup_label" in the data directory. We can
> > > even define what it should be instead.
> > >
> > > 2. Make pg_start_backup() also return a "cookie" value with a
> > > unique identifier if asked to run in "disconnectable mode". Store this
> > > identifier in shared state somewhere in the backend.
> >
> > Seems like we could possibly just make this be the WAL position the
> > backup started at, since we use that as the finishing location...?  I
> > get that users could screw up passing the value back in and get a
> > corrupted backup, but it'd avoid us having to stick anything in shared
> > memory, wouldn't it?
>
> Or, how about we require them to provide the backup label contents in its
> entirety? Which I believe does contain that WAL portion?  The downside of
> that is that it would be multiline which might screw with shellscripts.

This is getting a bit beyond the pale imv if we're talking about trying
to cater to software that can't manage to handle a multi-line result or
multi-line argument to a function.

> We would still need to stick *something* in there so we can keep track of
> when it's done. You should only be able to stop each backup once for
> example.  Otherwise, you'll end up with XLogCtl->Insert.nonExclusiveBackups
> being wrong if someone calls stop twice with the same cookie/wal location.

Hrmpf, yeah, I suppose we can't really avoid having to track it somehow,
in which case, yeah, we might as well just have some unique ID for each
that we then look up in shared memory (though that unique ID could be
the starting WAL if we wanted, but I don't suppose it particularly
matters if it's that or something else).  We'd need to have some
parameter like max_concurrent_backups or something tho, which I was
hoping to avoid.
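
To make the bookkeeping concrete, here's a rough sketch of what I have in
mind, in Python purely for illustration. Nothing here is an existing
PostgreSQL API: BackupRegistry, start_backup, stop_backup and the
max_concurrent_backups setting are all made-up names, and a dict stands in
for a fixed-size shared memory table. The point is only that removing the
entry on stop is what keeps a second stop with the same cookie from
decrementing the running-backup count twice.

```python
import uuid


class BackupRegistry:
    """Toy stand-in for a fixed-size shared-memory table of running backups."""

    def __init__(self, max_concurrent_backups):
        # Hypothetical setting: shared memory has to be sized up front.
        self.max_concurrent_backups = max_concurrent_backups
        self.active = {}  # cookie -> starting WAL location

    def start_backup(self, start_lsn):
        if len(self.active) >= self.max_concurrent_backups:
            raise RuntimeError("too many concurrent backups")
        # Assumption: a random ID; the starting WAL location could serve too.
        cookie = uuid.uuid4().hex
        self.active[cookie] = start_lsn
        return cookie

    def stop_backup(self, cookie):
        # pop() removes the entry, so each backup can be stopped only once.
        try:
            return self.active.pop(cookie)
        except KeyError:
            raise LookupError("unknown or already stopped backup") from None

    @property
    def running(self):
        # Plays the role of the nonExclusiveBackups counter.
        return len(self.active)
```

The cap on entries is exactly the part I'd rather not have, but I don't
see a way around it if the table has to live in shared memory.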

> > > 5. Perhaps provide a row in pg_stat_progress_basebackup, or in its own
> > > view, to show these "disconnectable mode backups"?
> > > 5b. In fact, maybe provide something like that for the current
> > > non-exclusive ones as well, in case people have hung sessions.
> >
> > Ideally we'd provide a way for external backup tools to update said
> > progress with information of their own like the % complete, if they wish
> > to...
>
> That's definitely moving the goalposts on this several miles :) Not saying
> that wouldn't be nice, but let's keep those separate :P

Well, you brought up progress monitoring. :)

> > > The weak spot in this is still the "don't write it as backup label", but we
> > > can document that. And that would allow us to do non-exclusive base backups
> > > without requiring maintaining the connection. And we can completely get rid
> > > of the exclusive ones.
> >
> > If this helps us get rid of exclusive backup mode then that's certainly
> > helpful, though inventing yet another method of doing backup makes me
> > cringe to think of the documentation complexity since it doesn't seem
> > like we'd really be reducing that.  Now, if we also removed the existing
> > non-exclusive method and then had a *single* backup method with clear
> > documentation as to how to use it properly, that'd be a marked
> > improvement overall.
>
> We could do that, but the passing of cookies or whatever is completely
> unnecessary with the current method so it would complicate that one...

No, it wouldn't complicate "that one" since "that one" would be gone.
Yes, it'd be a slightly more complicated process than what they handle
currently, but I'm confident that every tool that today arranges to keep
a PG connection open could instead pass the cookie through from start to
stop backup (or do both, if they want to keep that connection anyway).
I do think it would massively simplify the documentation, which would
make it well worth it.
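
For the tool side, passing the cookie through needn't be much more than
the sketch below (Python again, illustrative only). The function names
and the JSON state file are my own invention, not part of any existing
tool, and the cookie and label values are assumed to be whatever the
start call returned. JSON copes with the multi-line label without any
special handling.

```python
import json
import os


def save_backup_state(state_path, cookie, label):
    """Persist what the start call returned so a later process can finish."""
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump({"cookie": cookie, "label": label}, f)
        f.flush()
        os.fsync(f.fileno())
    # Rename into place so a crash never leaves a half-written state file.
    os.replace(tmp_path, state_path)


def load_backup_state(state_path):
    """Read the cookie and label back, possibly from a different process."""
    with open(state_path) as f:
        state = json.load(f)
    return state["cookie"], state["label"]
```

The process that stops the backup reads the state file and hands the
cookie back, so nothing needs to stay connected in between.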

Thanks,

Stephen
