Thread: Recovery target 'immediate'

Recovery target 'immediate'

From
Heikki Linnakangas
Date:
I just found out that if you use continuous archiving and online 
backups, it's surprisingly difficult to restore a backup, without 
replaying any more WAL than necessary.

If you don't set a recovery target, PostgreSQL will recover all the WAL 
it finds. You can set recovery target time to a point immediately after 
the end-of-backup record, but that's tricky. You have to somehow find 
out the exact time when the backup ended, and set it to that. But if you
set it at all too early, recovery will abort with a "requested recovery stop
point is before consistent recovery point" error. And that's not quite
precise anyway; not all record types carry timestamps, so you will 
always replay a few extra records until the first timestamped record 
comes along. Setting recovery_target_xid is similarly difficult. If you 
were well prepared, you created a named recovery point with 
pg_create_restore_point() immediately after the backup ended, and you 
can use that, but that requires forethought.

It seems that we're missing a setting, something like recovery_target = 
'immediate', which would mean "stop as soon as consistency is reached". 
Or am I missing some trick?

- Heikki



Re: Recovery target 'immediate'

From
Robert Haas
Date:
On Thu, Apr 18, 2013 at 2:11 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> I just found out that if you use continuous archiving and online backups,
> it's surprisingly difficult to restore a backup, without replaying any more
> WAL than necessary.
>
> If you don't set a recovery target, PostgreSQL will recover all the WAL it
> finds. You can set recovery target time to a point immediately after the
> end-of-backup record, but that's tricky. You have to somehow find out the
> exact time when the backup ended, and set it to that. But if you set it any
> too early, recovery will abort with "requested recovery stop point is before
> consistent recovery point" error. And that's not quite precise anyway; not
> all record types carry timestamps, so you will always replay a few extra
> records until the first timestamped record comes along. Setting
> recovery_target_xid is similarly difficult. If you were well prepared, you
> created a named recovery point with pg_create_restore_point() immediately
> after the backup ended, and you can use that, but that requires forethought.
>
> It seems that we're missing a setting, something like recovery_target =
> 'immediate', which would mean "stop as soon as consistency is reached". Or
> am I missing some trick?

You know, I've been wondering for years how you're supposed to do
this.  Huge +1 for adding something like this, if it doesn't exist
already.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Recovery target 'immediate'

From
Fujii Masao
Date:
On Fri, Apr 19, 2013 at 10:30 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Apr 18, 2013 at 2:11 PM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> I just found out that if you use continuous archiving and online backups,
>> it's surprisingly difficult to restore a backup, without replaying any more
>> WAL than necessary.
>>
>> If you don't set a recovery target, PostgreSQL will recover all the WAL it
>> finds. You can set recovery target time to a point immediately after the
>> end-of-backup record, but that's tricky. You have to somehow find out the
>> exact time when the backup ended, and set it to that. But if you set it any
>> too early, recovery will abort with "requested recovery stop point is before
>> consistent recovery point" error. And that's not quite precise anyway; not
>> all record types carry timestamps, so you will always replay a few extra
>> records until the first timestamped record comes along. Setting
>> recovery_target_xid is similarly difficult. If you were well prepared, you
>> created a named recovery point with pg_create_restore_point() immediately
>> after the backup ended, and you can use that, but that requires forethought.
>>
>> It seems that we're missing a setting, something like recovery_target =
>> 'immediate', which would mean "stop as soon as consistency is reached". Or
>> am I missing some trick?
>
> You know, I've been wondering for years how you're supposed to do
> this.  Huge +1 for adding something like this, if it doesn't exist
> already.

I also don't know a good way to do that. +1

Regards,

-- 
Fujii Masao



Re: Recovery target 'immediate'

From
Jaime Casanova
Date:
On Fri, Apr 19, 2013 at 8:30 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Apr 18, 2013 at 2:11 PM, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>>
>> It seems that we're missing a setting, something like recovery_target =
>> 'immediate', which would mean "stop as soon as consistency is reached". Or
>> am I missing some trick?
>
> You know, I've been wondering for years how you're supposed to do
> this.  Huge +1 for adding something like this, if it doesn't exist
> already.
>

Hi,

You can use the pause_at_recovery_target parameter in recovery.conf and
try one recovery target at a time... or, of course, create a
pause_at_recovery_consistency parameter (the name could be different) for that.


--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566         Cell: +593 987171157



Re: Recovery target 'immediate'

From
Sergey Burladyan
Date:
On Thu, Apr 18, 2013 at 10:11 PM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> I just found out that if you use continuous archiving and online backups,
> it's surprisingly difficult to restore a backup, without replaying any more
> WAL than necessary.

You can find the first WAL file name in backup_label, under "START WAL
LOCATION". Where to find the last WAL file name depends on the source of
the backup: if the backup was taken from a slave, use pg_control from the
backup and its "Minimum recovery ending location"; if it was taken from the
master, use "STOP WAL LOCATION" from the backup .history file :-) Then I
just copy the needed WAL files from the archive into pg_xlog and remove
recovery.conf.

> It seems that we're missing a setting, something like recovery_target =
> 'immediate', which would mean "stop as soon as consistency is reached". Or
> am I missing some trick?

This will be helpful :)

--
Sergey Burladyan
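
The lookup described above can be sketched in Python. This is only an
illustration: the function names are made up, and the sample lines in the
comment follow the usual layout of a backup history file, which should be
checked against a real backup. It covers only the backup-from-master case
and assumes both file names are on the same timeline; reading "Minimum
recovery ending location" out of pg_control is not shown.

```python
import re

# Matches lines such as (assumed layout of a backup history file):
#   START WAL LOCATION: 0/9000028 (file 000000010000000000000009)
#   STOP WAL LOCATION: 0/B0000F8 (file 00000001000000000000000B)
LOCATION = re.compile(
    r"^(START|STOP) WAL LOCATION: [0-9A-F]+/[0-9A-F]+ \(file ([0-9A-F]{24})\)",
    re.MULTILINE,
)


def wal_file_range(history_text):
    """Return (first, last) WAL file names named in a backup history file."""
    found = {kind: name for kind, name in LOCATION.findall(history_text)}
    if "START" not in found or "STOP" not in found:
        raise ValueError("START or STOP WAL LOCATION line is missing")
    return found["START"], found["STOP"]


def needed_wal_files(history_text, archived_names):
    """Pick the archived segments that lie between the two file names.

    Plain string comparison is enough here because the names are
    fixed-width uppercase hex and are assumed to share one timeline.
    """
    first, last = wal_file_range(history_text)
    return sorted(
        name for name in archived_names
        if len(name) == 24 and first <= name <= last
    )
```

The resulting list is what would be copied from the archive into pg_xlog
before starting the server without recovery.conf.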

Re: Recovery target 'immediate'

From
Michael Paquier
Date:



On Fri, Apr 19, 2013 at 3:11 AM, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> I just found out that if you use continuous archiving and online backups,
> it's surprisingly difficult to restore a backup, without replaying any more
> WAL than necessary.
>
> If you don't set a recovery target, PostgreSQL will recover all the WAL it
> finds. You can set recovery target time to a point immediately after the
> end-of-backup record, but that's tricky. You have to somehow find out the
> exact time when the backup ended, and set it to that. But if you set it any
> too early, recovery will abort with "requested recovery stop point is before
> consistent recovery point" error. And that's not quite precise anyway; not
> all record types carry timestamps, so you will always replay a few extra
> records until the first timestamped record comes along. Setting
> recovery_target_xid is similarly difficult. If you were well prepared, you
> created a named recovery point with pg_create_restore_point() immediately
> after the backup ended, and you can use that, but that requires forethought.
>
> It seems that we're missing a setting, something like recovery_target =
> 'immediate', which would mean "stop as soon as consistency is reached". Or
> am I missing some trick?

+1. This will be really helpful. I don't know of any good way either to stop immediately after reaching a consistent point, short of contriving a target just after the end of the backup.
--
Michael

Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 18 April 2013 19:11, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

> I just found out that if you use continuous archiving and online backups,
> it's surprisingly difficult to restore a backup, without replaying any more
> WAL than necessary.

I didn't add it myself because I don't see the need, if we think more carefully.

Why would you want your recovery end time to be governed solely by the
time that the *backup* ended? How can that have any bearing on what
you want at recovery time? If you have access to more WAL data, why
would you not apply them as well - unless you have some specific
reason not to - i.e. an incorrect xid or known problem time?

If you're storing only a few of the WAL files with the backup then it
will end naturally without assistance when the last file runs out.
What is the difference between stopping at an exact point in WAL half
way through a file and ending at the end of the file? If the end point
is arbitrary, why the need to specify it so closely?

I can't see a time when I have access to more WAL files *and* I want
to stop early at some imprecise point. But you could write a
restore_command script that stopped after a specific file forcing
recovery to end.

I don't think we should add a feature that encourages the belief that
it makes sense (because it's approved by the developers) to stop
recovery at an arbitrary point, deliberately discarding user data.
That just encourages sysadmins to not communicate with
business/management about the exact details of a recovery.

So -1, given it doesn't seem to make sense anyway, but if it did there
are already 2 ways of stopping at an arbitrary point.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
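
The restore_command approach mentioned above, a script that stops handing
out files after a specific one, might look like the following Python
sketch. The function names and the argument order are hypothetical. It
relies on the documented convention that a non-zero exit status tells
recovery the requested file is not available. Note that recovery would
still replay to the end of the last file served, so the stop point has
file granularity only.

```python
import re
import shutil
from pathlib import Path

# A regular WAL segment name is 24 uppercase hex digits:
# 8 for the timeline ID, then 16 for the log and segment numbers.
WAL_NAME = re.compile(r"^[0-9A-F]{24}$")


def should_restore(file_name, last_wanted):
    """Return True if this file should still be handed to recovery."""
    if not WAL_NAME.match(file_name):
        # Timeline history and backup history files always pass through.
        return True
    # Ignore the timeline ID and compare the log/segment part only.
    return file_name[8:] <= last_wanted[8:]


def main(argv):
    """argv: [script, archive_dir, last_wanted, file_name, dest_path]."""
    archive_dir, last_wanted, file_name, dest_path = argv[1:5]
    if not should_restore(file_name, last_wanted):
        return 1  # pretend the file is missing so that recovery ends
    source = Path(archive_dir) / file_name
    if not source.is_file():
        return 1
    shutil.copy(source, dest_path)
    return 0

# When run as a script, finish with: sys.exit(main(sys.argv))
```

In recovery.conf this could be wired up along the lines of
restore_command = 'python3 /path/to/stop_after.py /mnt/archive 000000010000000000000005 %f %p',
where the script path, archive directory and segment name are placeholders.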



Re: Recovery target 'immediate'

From
Heikki Linnakangas
Date:
On 26.04.2013 12:16, Simon Riggs wrote:
> On 18 April 2013 19:11, Heikki Linnakangas<hlinnakangas@vmware.com>  wrote:
>
>> I just found out that if you use continuous archiving and online backups,
>> it's surprisingly difficult to restore a backup, without replaying any more
>> WAL than necessary.
>
> I didn't add it myself because I don't see the need, if we think more carefully.
>
> Why would you want your recovery end time to be governed solely by the
> time that the *backup* ended? How can that have any bearing on what
> you want at recovery time? If you have access to more WAL data, why
> would you not apply them as well - unless you have some specific
> reason not to - i.e. an incorrect xid or known problem time?
>
> If you're storing only a few of the WAL files with the backup then it
> will end naturally without assistance when the last file runs out.
> What is the difference between stopping at an exact point in WAL half
> way through a file and ending at the end of the file? If the end point
> is arbitrary, why the need to specify it so closely?
>
> I can't see a time when I have access to more WAL files *and* I want
> to stop early at some imprecise point. But you could write a
> restore_command script that stopped after a specific file forcing
> recovery to end.

Well, I ran into this with VMware's Data Director, which manages backups 
among other things. In a typical setup, you have a WAL archive, and 
every now and then (daily, typically) a full backup is taken. Full 
backups are retained for some time, like a few weeks or months. The user 
can also manually request a full backup to be taken at any time.

There is an option to perform PITR. The system figures out the latest 
full backup that precedes the chosen point-in-time, sets 
recovery_target_time, and starts up Postgres. But there is also an 
operation to simply "restore a backup". The idea of that is to, well, 
restore to the chosen backup, and nothing more. In most cases, it 
probably wouldn't hurt if one or two extra WAL files are replayed 
beyond the backup end time, but you certainly don't want to replay all 
the history. Yes, you could set recovery_target_time to the point where 
the backup ended, but that's complicated. You'd have to read the 
end-of-backup timestamp from the backup history file. And because 
timestamps are always a bit fuzzy, I think you'd have to add at least a 
few seconds to that to be sure.

To illustrate why it would be bad to replay more WAL than necessary, 
imagine that the user is about to perform some dangerous action he might 
want to undo later. For example, he's about to purge old data that isn't 
needed anymore, with "DELETE FROM data WHERE year <= '2010'". The 
first thing he does is to take a backup with label 
"before-purging-2010". Immediately after the backup has finished, he 
performs the deletion. Now, the application stops working because it 
actually still needs the data, so he restores from the backup. If 
recovery decides to replay a few more WAL files after the end-of-backup, 
that could include the deletion, and that's no good.

One solution is to create a restore point after the backup ends. Then you 
have a clearly defined point in time you can restore to. But it would be 
convenient to not have to do that. Or another way to think of this is 
that it would be convenient if there was an implicit restore point at 
the end of each backup.

- Heikki
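
To make the "read the end-of-backup timestamp and add a few seconds"
workaround concrete, here is a rough Python sketch. The function name and
the five-second default are invented for illustration, and the "STOP TIME:"
line format is assumed from typical backup history files. The time zone
abbreviation is carried over as text rather than interpreted, so this would
misbehave across a DST change. The padding is exactly the fuzziness
described above: anything committed within the margin is replayed too.

```python
import re
from datetime import datetime, timedelta

# Assumed line format: "STOP TIME: 2013-04-18 21:12:58 EEST"
STOP_TIME = re.compile(
    r"^STOP TIME: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\S+)$",
    re.MULTILINE,
)


def target_time_after_backup(history_text, margin_seconds=5):
    """Build a recovery_target_time line slightly after the backup end."""
    match = STOP_TIME.search(history_text)
    if match is None:
        raise ValueError("no STOP TIME line found")
    stamp, zone = match.groups()
    stopped = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    # Pad the timestamp so the target is not before the consistency point.
    padded = stopped + timedelta(seconds=margin_seconds)
    return f"recovery_target_time = '{padded:%Y-%m-%d %H:%M:%S} {zone}'"
```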



Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 26 April 2013 11:29, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

> But there is also an operation to simply "restore a backup".

Just because a tool supports an imprecise definition of a restore,
doesn't mean Postgres should encourage and support that.

"Restore a backup" is more suited to filesystems where most files
don't change much. And it's also a common user complaint: "I restored
my backup but now I've lost my changes. Can you help?". That's not
something that's been heard around here because we don't encourage
foot-guns.

> One solution is to create restore point after the backup ends. Then you have
> a clearly defined point in time you can restore to. But it would be
> convenient to not have to do that. Or another way to think of this is that
> it would be convenient if there was an implicit restore point at the end of
> each backup.

If we were going to solve that problem, that would be the way to do it.

But then we could also solve other similar problems. Like queries that
run for a long time. We could just have them end after a specific time
rather than run to completion and give a correct answer. We could skip
joins that look difficult as well. After all "Run Query" wasn't a very
precise definition of what the user wanted, so what's wrong with
taking a more relaxed attitude to query execution? They will
appreciate the performance gain, after all.

Precision and doing the safe thing are what people trust us to do.

I recognise this as a common request from users, I just don't think we
should add an option to Postgres to support this when imprecise
recovery is already supported by external means for those that take
the conscious decision to do things that way.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Recovery target 'immediate'

From
Magnus Hagander
Date:
On Fri, Apr 26, 2013 at 1:47 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 26 April 2013 11:29, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>
>> But there is also an operation to simply "restore a backup".
>
> Just because a tool supports an imprecise definition of a restore,
> doesn't mean Postgres should encourage and support that.
>
> "Restore a backup" is more suited to filesystems where most files
> don't change much. And its also a common user complaint: "I restored
> my back but now I've lost my changes. Can you help?". That's not
> something that's been heard around here because we don't encourage
> foot-guns.

I think it makes perfect sense to have this. Since we do guarantee it
to still be consistent even if things *are* changing around. The lack
of an easy way to do this is probably the most common reason I've seen
for people using pg_dump instead of physical backups in the past.
pg_basebackup fixed it for the backup side of things, with the -x
option. This appears to be a suggestion to do that kind of restore
even if you have log archive style backups.

That said, maybe the easier choice for a *system* (such as v-thingy)
would be to simply do the full backup using pg_basebackup -x (or
similar), therefore not needing the log archive at all when restoring.
Yes, it makes the base backup slightly larger, but also much
simpler... As a bonus, your base backup would still work if you hosed
your log archive.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/



Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 26 April 2013 12:54, Magnus Hagander <magnus@hagander.net> wrote:

> That said, maybe the easier choice for a *system* (such as v-thingy)
> would be to simply to the full backup using pg_basebackup -x (or
> similar), therefor not needing the log archive at all when restoring.
> Yes, it makes the base backup slightly larger, but also much
> simpler... As a bonus, your base backup would still work if you hosed
> your log archive.

Good point. My comments also apply there.

I think we should put a clear health warning on that to explain what
you get and don't get.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Recovery target 'immediate'

From
Heikki Linnakangas
Date:
On 26.04.2013 14:54, Magnus Hagander wrote:
> On Fri, Apr 26, 2013 at 1:47 PM, Simon Riggs<simon@2ndquadrant.com>  wrote:
>> On 26 April 2013 11:29, Heikki Linnakangas<hlinnakangas@vmware.com>  wrote:
>>
>>> But there is also an operation to simply "restore a backup".
>>
>> Just because a tool supports an imprecise definition of a restore,
>> doesn't mean Postgres should encourage and support that.
>>
>> "Restore a backup" is more suited to filesystems where most files
>> don't change much. And its also a common user complaint: "I restored
>> my back but now I've lost my changes. Can you help?". That's not
>> something that's been heard around here because we don't encourage
>> foot-guns.
>
> I think it makes perfect sense to have this. Since we do guarantee it
> to still be consistent even if things *are* changing around. The lack
> of an easy way to do this is probably the most common reason I've seen
> for people using pg_dump instead of physical backups in the past.
> pg_basebackup fixed it for the backup side of things, with the -x
> option. This appears to be a suggestion to do that kind of restore
> even if you have a log archive style backups.
>
> That said, maybe the easier choice for a *system* (such as v-thingy)
> would be to simply to the full backup using pg_basebackup -x (or
> similar), therefor not needing the log archive at all when restoring.

Even if you have all the required WAL files included in the backup, 
you'll still want to use a restore_command that can restore timeline 
history files from the archive (I found this out the hard way). 
Otherwise Postgres won't see the existing timeline history files, and 
can choose a timeline ID that's already in use. That will cause 
confusion after recovery when files generated on the new timeline start 
to be archived; they will clash with files from the "other" timeline 
with the same TLI. You can work around that with a restore_command 
that returns false for regular WAL files, but restores timeline history 
files normally. But that's inconvenient again; it's not trivial to 
formulate such a restore_command.

Also, pg_basebackup is a lot less efficient than working straight with 
the filesystem. It's a very convenient stand-alone backup tool, but if 
you're writing a backup handling system, you'll want to use something 
more efficient. (Data Director uses disk snapshots, as it happens)

- Heikki
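
A restore_command of the kind described above, one that reports regular
WAL files as missing but restores timeline history files normally, could
be sketched in Python as follows. The function name and argument order are
hypothetical; the return value is meant to be used as the script's exit
status, with non-zero meaning "file not available".

```python
import shutil
from pathlib import Path


def restore_history_only(archive_dir, file_name, dest_path):
    """Exit status for a restore_command serving only timeline history files."""
    if not file_name.endswith(".history"):
        # Regular WAL segments are already in the backup, so report
        # them as absent from the archive.
        return 1
    source = Path(archive_dir) / file_name
    if not source.is_file():
        return 1
    shutil.copy(source, dest_path)
    return 0
```

A small wrapper would pass %f and %p from recovery.conf as file_name and
dest_path and call sys.exit() with the result.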



Re: Recovery target 'immediate'

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> That said, maybe the easier choice for a *system* (such as v-thingy)
> would be to simply to the full backup using pg_basebackup -x (or
> similar), therefor not needing the log archive at all when restoring.
> Yes, it makes the base backup slightly larger, but also much
> simpler... As a bonus, your base backup would still work if you hosed
> your log archive.

It doesn't appear to me that that resolves Heikki's complaint: if you
recover from such a backup, the state that you get is still rather vague
no?  The system will replay to the end of whichever WAL file it last
copied.

I think it'd be a great idea to ensure that pg_stop_backup creates a
well defined restore stop point that corresponds to some instant during
the execution of pg_stop_backup.  Obviously, if other sessions are
changing the database state meanwhile, it's impossible to pin it down
more precisely than that; but I think this would satisfy the principle
of least astonishment, and it's not clear that what we have now does.
        regards, tom lane



Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 26 April 2013 14:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> That said, maybe the easier choice for a *system* (such as v-thingy)
>> would be to simply to the full backup using pg_basebackup -x (or
>> similar), therefor not needing the log archive at all when restoring.
>> Yes, it makes the base backup slightly larger, but also much
>> simpler... As a bonus, your base backup would still work if you hosed
>> your log archive.
>
> It doesn't appear to me that that resolves Heikki's complaint: if you
> recover from such a backup, the state that you get is still rather vague
> no?  The system will replay to the end of whichever WAL file it last
> copied.
>
> I think it'd be a great idea to ensure that pg_stop_backup creates a
> well defined restore stop point that corresponds to some instant during
> the execution of pg_stop_backup.  Obviously, if other sessions are
> changing the database state meanwhile, it's impossible to pin it down
> more precisely than that; but I think this would satisfy the principle
> of least astonishment, and it's not clear that what we have now does.

Restore points are definitely the way to go here, this is what they
were created for. Stopping at a labelled location has a defined
meaning for the user, which is much better than just "stop anywhere
convenient", which I found so frightening.

It should be straightforward to create a restore point with the same
name as used in pg_start_backup('text');

pg_basebackup backups would need to use a unique key, which is harder
to achieve. If we write a WAL record at backup start that would make
the starting LSN unique, so we could then use that for the restore
point name for that backup.

If people want anything else they can request an additional restore
point at the end of the backup.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Recovery target 'immediate'

From
Robert Haas
Date:
On Fri, Apr 26, 2013 at 10:05 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Restore points are definitely the way to go here, this is what they
> were created for. Stopping at a labelled location has a defined
> meaning for the user, which is much better than just "stop anywhere
> convenient", which I found so frightening.
>
> It should be straightforward to create a restore point with the same
> name as used in pg_start_backup('text');
>
> pg_basebackup backups would need to use a unique key, which is harder
> to achieve. If we write a WAL record at backup start that would make
> the starting LSN unique, so we could then use that for the restore
> point name for that backup.
>
> If people want anything else they can request an additional restore
> point at the end of the backup.

I personally find this to be considerably more error-prone than
Heikki's suggestion.  On the occasions when I have had the dubious
pleasure of trying to do PITR recovery, it's quite easy to supply a
recovery target that never actually gets matched - and then you
accidentally recover all the way to the end of WAL.  This is not fun.
Having a bulletproof way to say "recover until you reach consistency
and then stop" is a much nicer API.  I don't think "stop as soon as
possible" is at all the same thing as "stop anywhere convenient".

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Recovery target 'immediate'

From
Magnus Hagander
Date:
On Apr 26, 2013 4:38 PM, "Robert Haas" <robertmhaas@gmail.com> wrote:
>
> On Fri, Apr 26, 2013 at 10:05 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Restore points are definitely the way to go here, this is what they
>> were created for. Stopping at a labelled location has a defined
>> meaning for the user, which is much better than just "stop anywhere
>> convenient", which I found so frightening.
>>
>> It should be straightforward to create a restore point with the same
>> name as used in pg_start_backup('text');
>>
>> pg_basebackup backups would need to use a unique key, which is harder
>> to achieve. If we write a WAL record at backup start that would make
>> the starting LSN unique, so we could then use that for the restore
>> point name for that backup.
>>
>> If people want anything else they can request an additional restore
>> point at the end of the backup.
>
> I personally find this to be considerably more error-prone than
> Heikki's suggestion.  On the occasions when I have had the dubious
> pleasure of trying to do PITR recovery, it's quite easy to supply a
> recovery target that never actually gets matched - and then you
> accidentally recover all the way to the end of WAL.  This is not fun.
> Having a bulletproof way to say "recover until you reach consistency
> and then stop" is a much nicer API.  I don't think "stop as soon as
> possible" is at all the same thing as "stop anywhere convenient".
>

Thinking some more about it, this could also be useful together with pausing
at the recovery target to get a quick look at the state of things before
recovering further. I assume that would work as well, since it would be "a
recovery target like the others"..

/Magnus

Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 26 April 2013 15:38, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Apr 26, 2013 at 10:05 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Restore points are definitely the way to go here, this is what they
>> were created for. Stopping at a labelled location has a defined
>> meaning for the user, which is much better than just "stop anywhere
>> convenient", which I found so frightening.
>>
>> It should be straightforward to create a restore point with the same
>> name as used in pg_start_backup('text');
>>
>> pg_basebackup backups would need to use a unique key, which is harder
>> to achieve. If we write a WAL record at backup start that would make
>> the starting LSN unique, so we could then use that for the restore
>> point name for that backup.
>>
>> If people want anything else they can request an additional restore
>> point at the end of the backup.
>
> I personally find this to be considerably more error-prone than
> Heikki's suggestion.

Given that I was describing how we might implement Heikki's
suggestion, I find this comment confusing.

Please explain.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Recovery target 'immediate'

From
Robert Haas
Date:
On Fri, Apr 26, 2013 at 11:35 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Given that I was describing how we might implement Heikki's
> suggestion, I find this comment confusing.
>
> Please explain.

Heikki's suggestion is simply to have a mode that stops as soon as
consistency is reached.  The server already knows (from the backup
label) what the consistency point is, so there's no need to add a
restore point or anything else to the WAL stream to implement what
he's talking about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 26 April 2013 16:38, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Apr 26, 2013 at 11:35 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Given that I was describing how we might implement Heikki's
>> suggestion, I find this comment confusing.
>>
>> Please explain.
>
> Heikki's suggestion is simply to have a mode that stops as soon as
> consistency is reached.  The server already knows (from the backup
> label) what the consistency point is, so there's no need to add a
> restore point or anything else to the WAL stream to implement what
> he's talking about.

Using restore points just puts into use the facility that is already
best practice to use, put there for just this kind of situation.
I guess you could do recovery_target_name = '$consistent'

Doing it the other way means you need to add a new kind of recovery
target to the API just for this.
recovery_target_immediate = on

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Recovery target 'immediate'

From
Heikki Linnakangas
Date:
On 26.04.2013 19:05, Simon Riggs wrote:
> On 26 April 2013 16:38, Robert Haas<robertmhaas@gmail.com>  wrote:
>> On Fri, Apr 26, 2013 at 11:35 AM, Simon Riggs<simon@2ndquadrant.com>  wrote:
>>> Given that I was describing how we might implement Heikki's
>>> suggestion, I find this comment confusing.
>>>
>>> Please explain.
>>
>> Heikki's suggestion is simply to have a mode that stops as soon as
>> consistency is reached.  The server already knows (from the backup
>> label) what the consistency point is, so there's no need to add a
>> restore point or anything else to the WAL stream to implement what
>> he's talking about.
>
> Using restore points just puts into use the facility that is already
> best practice to use, put there for just this kind of situation.
> I guess you could do recovery_target_name = '$consistent'
>
> Doing it the other way means you need to add a new kind of recovery
> target to the API just for this.
> recovery_target_immediate = on

Sounds good to me.

Actually, from a usability point of view I think it would be nice to have 
just one setting, "recovery_target". It's already somewhat confusing to 
have recovery_target_xid, recovery_target_time, and 
recovery_target_name, which are mutually exclusive, and 
recovery_target_inclusive which is just a modifier for the others. Maybe 
something like:

recovery_target = 'xid 1234'
recovery_target = 'xid 1234 exclusive'
recovery_target = '2013-04-22 12:33'
recovery_target = '2013-04-22 12:33 exclusive'
recovery_target = 'consistent'
recovery_target = 'name: daily backup'

- Heikki
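
To show how the single-setting syntax proposed above could be told apart
by a parser, here is a small Python sketch. The syntax is only a proposal
in this thread, not an existing PostgreSQL setting, and the Target type and
parse_recovery_target function are invented for illustration. The sketch
treats "exclusive" as a modifier for xid and time targets only, matching
how recovery_target_inclusive is described, and treats anything that is
not otherwise recognised as a timestamp without validating it.

```python
from typing import NamedTuple


class Target(NamedTuple):
    kind: str        # "xid", "time", "name" or "consistent"
    value: str
    inclusive: bool


def parse_recovery_target(setting):
    """Split one proposed recovery_target value into kind, value, modifier."""
    text = setting.strip()
    if text == "consistent":
        return Target("consistent", "", True)
    if text.startswith("name:"):
        # Handled before the modifier check so that a restore point whose
        # name happens to end in "exclusive" is kept intact.
        return Target("name", text[len("name:"):].strip(), True)
    inclusive = True
    if text.endswith(" exclusive"):
        inclusive = False
        text = text[: -len(" exclusive")].rstrip()
    if text.startswith("xid "):
        xid = text[len("xid "):].strip()
        if not xid.isdigit():
            raise ValueError(f"invalid xid: {xid!r}")
        return Target("xid", xid, inclusive)
    # Everything else is assumed to be a timestamp.
    return Target("time", text, inclusive)
```

One consequence visible in the sketch is that the order of the checks
matters: the string has to be classified by prefix and suffix, which is
part of the trade-off against separate settings.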



Re: Recovery target 'immediate'

From
Robert Haas
Date:
On Fri, Apr 26, 2013 at 12:25 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
>> Doing it the other way means you need to add a new kind of recovery
>> target to the API just for this.
>> recovery_target_immediate = on
>
> Sounds good to me.

Yeah, I don't have a problem with that, at all.

> Actually, from a usability point of view I think would be nice to have just
> one setting, "recovery_target". It's already somewhat confusing to have
> recovery_target_xid, recovery_target_time, and recovery_target_name, which
> are mutually exclusive, and recovery_target_inclusive which is just a
> modifier for the others. Maybe something like:
>
> recovery_target = 'xid 1234'
> recovery_target = 'xid 1234 exclusive'
> recovery_target = '2013-04-22 12:33'
> recovery_target = '2013-04-22 12:33 exclusive'
> recovery_target = 'consistent'
> recovery_target = 'name: daily backup'

I agree that the current API is confusing in exactly the way you
describe.  Whether this is an improvement, I'm not sure.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 26 April 2013 17:25, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 26.04.2013 19:05, Simon Riggs wrote:
>>
>> On 26 April 2013 16:38, Robert Haas<robertmhaas@gmail.com>  wrote:
>>>
>>> On Fri, Apr 26, 2013 at 11:35 AM, Simon Riggs<simon@2ndquadrant.com>
>>> wrote:
>>>>
>>>> Given that I was describing how we might implement Heikki's
>>>> suggestion, I find this comment confusing.
>>>>
>>>> Please explain.
>>>
>>>
>>> Heikki's suggestion is simply to have a mode that stops as soon as
>>> consistency is reached.  The server already knows (from the backup
>>> label) what the consistency point is, so there's no need to add a
>>> restore point or anything else to the WAL stream to implement what
>>> he's talking about.
>>
>>
>> Using restore points just puts into use the facility that is already
>> best practice to use, put there for just this kind of situation.
>> I guess you could do recovery_target_name = '$consistent'
>>
>> Doing it the other way means you need to add a new kind of recovery
>> target to the API just for this.
>> recovery_target_immediate = on
>
>
> Sounds good to me.
>
> Actually, from a usability point of view I think would be nice to have just
> one setting, "recovery_target". It's already somewhat confusing to have
> recovery_target_xid, recovery_target_time, and recovery_target_name, which
> are mutually exclusive, and recovery_target_inclusive which is just a
> modifier for the others. Maybe something like:
>
> recovery_target = 'xid 1234'
> recovery_target = 'xid 1234 exclusive'
> recovery_target = '2013-04-22 12:33'
> recovery_target = '2013-04-22 12:33 exclusive'
> recovery_target = 'consistent'
> recovery_target = 'name: daily backup'

So now you want to change the whole existing API so it fits with your
one new requirement?

Sounds like flamebait to me, but -1, just in case you're serious.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Recovery target 'immediate'

From
Magnus Hagander
Date:
On Fri, Apr 26, 2013 at 6:43 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 26 April 2013 17:25, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>> On 26.04.2013 19:05, Simon Riggs wrote:
>>>
>>> On 26 April 2013 16:38, Robert Haas<robertmhaas@gmail.com>  wrote:
>>>>
>>>> On Fri, Apr 26, 2013 at 11:35 AM, Simon Riggs<simon@2ndquadrant.com>
>>>> wrote:
>>>>>
>>>>> Given that I was describing how we might implement Heikki's
>>>>> suggestion, I find this comment confusing.
>>>>>
>>>>> Please explain.
>>>>
>>>>
>>>> Heikki's suggestion is simply to have a mode that stops as soon as
>>>> consistency is reached.  The server already knows (from the backup
>>>> label) what the consistency point is, so there's no need to add a
>>>> restore point or anything else to the WAL stream to implement what
>>>> he's talking about.
>>>
>>>
>>> Using restore points just puts into use the facility that is already
>>> best practice to use, put there for just this kind of situation.
>>> I guess you could do recovery_target_name = '$consistent'
>>>
>>> Doing it the other way means you need to add a new kind of recovery
>>> target to the API just for this.
>>> recovery_target_immediate = on
>>
>>
>> Sounds good to me.
>>
>> Actually, from a usability point of view I think would be nice to have just
>> one setting, "recovery_target". It's already somewhat confusing to have
>> recovery_target_xid, recovery_target_time, and recovery_target_name, which
>> are mutually exclusive, and recovery_target_inclusive which is just a
>> modifier for the others. Maybe something like:
>>
>> recovery_target = 'xid 1234'
>> recovery_target = 'xid 1234 exclusive'
>> recovery_target = '2013-04-22 12:33'
>> recovery_target = '2013-04-22 12:33 exclusive'
>> recovery_target = 'consistent'
>> recovery_target = 'name: daily backup'
>
> So now you want to change the whole existing API so it fits with your
> one new requirement?

I like that newer API suggestion better than what we have now - though
it can perhaps be improved even more. But I definitely don't think
it's worth breaking backwards compatibility for it. There are lots of
tools and scripts and whatnot out there that use the current API. I
think we need a bigger improvement than just a cleaner syntax to break
those.


--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/



Re: Recovery target 'immediate'

From
Heikki Linnakangas
Date:
On 26.04.2013 19:50, Magnus Hagander wrote:
> On Fri, Apr 26, 2013 at 6:43 PM, Simon Riggs<simon@2ndquadrant.com>  wrote:
>> On 26 April 2013 17:25, Heikki Linnakangas<hlinnakangas@vmware.com>  wrote:
>>> Actually, from a usability point of view I think would be nice to have just
>>> one setting, "recovery_target". It's already somewhat confusing to have
>>> recovery_target_xid, recovery_target_time, and recovery_target_name, which
>>> are mutually exclusive, and recovery_target_inclusive which is just a
>>> modifier for the others. Maybe something like:
>>>
>>> recovery_target = 'xid 1234'
>>> recovery_target = 'xid 1234 exclusive'
>>> recovery_target = '2013-04-22 12:33'
>>> recovery_target = '2013-04-22 12:33 exclusive'
>>> recovery_target = 'consistent'
>>> recovery_target = 'name: daily backup'
>>
>> So now you want to change the whole existing API so it fits with your
>> one new requirement?

No, I think the above would be a usability improvement whether or not we 
add the new feature.

> I like that newer API suggestion better than what we have now - though
> it can perhaps be improved even more. But I definitely don't think
> it's worth breaking backwards compatibility for it. There are lots of
> tools and scripts and whatnot out there that use the current API. I
> think we need a bigger improvement than just a cleaner syntax to break
> those.

It would be possible to do it in a backwards-compatible way, keeping the 
old API as is.  But yeah, might not be worth the effort.

- Heikki



Re: Recovery target 'immediate'

From
Bruce Momjian
Date:
On Fri, Apr 26, 2013 at 09:48:48AM -0400, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > That said, maybe the easier choice for a *system* (such as v-thingy)
> > would be to simply to the full backup using pg_basebackup -x (or
> > similar), therefor not needing the log archive at all when restoring.
> > Yes, it makes the base backup slightly larger, but also much
> > simpler... As a bonus, your base backup would still work if you hosed
> > your log archive.
> 
> It doesn't appear to me that that resolves Heikki's complaint: if you
> recover from such a backup, the state that you get is still rather vague
> no?  The system will replay to the end of whichever WAL file it last
> copied.
> 
> I think it'd be a great idea to ensure that pg_stop_backup creates a
> well defined restore stop point that corresponds to some instant during
> the execution of pg_stop_backup.  Obviously, if other sessions are
> changing the database state meanwhile, it's impossible to pin it down
> more precisely than that; but I think this would satisfy the principle
> of least astonishment, and it's not clear that what we have now does.

Should I add this as a TODO item?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: Recovery target 'immediate'

From
Michael Paquier
Date:



On Thu, May 2, 2013 at 7:40 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Fri, Apr 26, 2013 at 09:48:48AM -0400, Tom Lane wrote:
> > Magnus Hagander <magnus@hagander.net> writes:
> > > That said, maybe the easier choice for a *system* (such as v-thingy)
> > > would be to simply to the full backup using pg_basebackup -x (or
> > > similar), therefor not needing the log archive at all when restoring.
> > > Yes, it makes the base backup slightly larger, but also much
> > > simpler... As a bonus, your base backup would still work if you hosed
> > > your log archive.
> >
> > It doesn't appear to me that that resolves Heikki's complaint: if you
> > recover from such a backup, the state that you get is still rather vague
> > no?  The system will replay to the end of whichever WAL file it last
> > copied.
> >
> > I think it'd be a great idea to ensure that pg_stop_backup creates a
> > well defined restore stop point that corresponds to some instant during
> > the execution of pg_stop_backup.  Obviously, if other sessions are
> > changing the database state meanwhile, it's impossible to pin it down
> > more precisely than that; but I think this would satisfy the principle
> > of least astonishment, and it's not clear that what we have now does.
>
> Should I add this as a TODO item?

Definitely, it would make sense to note that somewhere.
Thanks!
--
Michael

Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 26 April 2013 18:13, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 26.04.2013 19:50, Magnus Hagander wrote:
>>
>> On Fri, Apr 26, 2013 at 6:43 PM, Simon Riggs<simon@2ndquadrant.com>
>> wrote:
>>>
>>> On 26 April 2013 17:25, Heikki Linnakangas<hlinnakangas@vmware.com>
>>> wrote:
>>>>
>>>> Actually, from a usability point of view I think would be nice to have
>>>> just
>>>>
>>>> one setting, "recovery_target". It's already somewhat confusing to have
>>>> recovery_target_xid, recovery_target_time, and recovery_target_name,
>>>> which
>>>> are mutually exclusive, and recovery_target_inclusive which is just a
>>>> modifier for the others. Maybe something like:
>>>>
>>>> recovery_target = 'xid 1234'
>>>> recovery_target = 'xid 1234 exclusive'
>>>> recovery_target = '2013-04-22 12:33'
>>>> recovery_target = '2013-04-22 12:33 exclusive'
>>>> recovery_target = 'consistent'
>>>> recovery_target = 'name: daily backup'
>>>
>>>
>>> So now you want to change the whole existing API so it fits with your
>>> one new requirement?
>
>
> No, I think the above would be a usability improvement whether or not we add
> the new feature.


I don't see the usability improvement. This is only being suggested to
make one new addition look cleaner; there isn't a common gripe that
the parameters are hard to use, other than their location and
the ability to treat them as GUCs.

This changes the existing API which will confuse people that know it
and invalidate everything written in software and on wikis as to how
to do it. That means all the "in case of fire break glass"
instructions are all wrong and need to be rewritten and retested.

It also introduces a single common datatype for such entries, where
before we had that xids were numbers, names were text, so this new
mechanism operates completely differently from all other GUC
parameters.

Plus it's inconsistent, in that with xids you have 'xid 1234' whereas
timestamps just say '2013-04-22' rather than 'timestamp 2013-04-22',
or with names should they end in a colon or not. There's no clear
differentiation between text for names and other keywords. Presumably
we'll need a complex parser to sort that out.
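
To make the parsing question concrete, here is a minimal Python sketch of
a parser for the six example strings. The grammar is an assumption read
off those examples, not anything that has been specified, and the names
RecoveryTarget and parse_recovery_target are made up for illustration.

```python
from datetime import datetime
from typing import NamedTuple


class RecoveryTarget(NamedTuple):
    kind: str        # 'xid', 'time', 'name' or 'consistent'
    value: object    # int, datetime, str or None
    inclusive: bool


def parse_recovery_target(setting: str) -> RecoveryTarget:
    """Parse the single-string syntax of the examples (hypothetical grammar)."""
    text = setting.strip()

    # Names are taken verbatim after the prefix, so a restore point
    # called 'nightly exclusive' keeps its suffix.
    if text.startswith("name:"):
        return RecoveryTarget("name", text[len("name:"):].strip(), True)

    if text == "consistent":
        return RecoveryTarget("consistent", None, True)

    # A trailing keyword turns the target into an exclusive one.
    inclusive = True
    if text.endswith(" exclusive"):
        inclusive = False
        text = text[: -len(" exclusive")].rstrip()

    if text.startswith("xid "):
        return RecoveryTarget("xid", int(text[len("xid "):]), inclusive)

    # Everything else must be a timestamp, since the examples give
    # timestamps no keyword of their own.
    stamp = datetime.strptime(text, "%Y-%m-%d %H:%M")
    return RecoveryTarget("time", stamp, inclusive)
```

Even in this small sketch the order of the checks matters: without the
"name:" prefix a restore point named 'consistent' or 'xid 1234' could
not be told apart from the keywords.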

When we add a new feature that requires a new format, will we change
the whole format again to make that fit in also?

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Recovery target 'immediate'

From
Magnus Hagander
Date:
On Thu, May 2, 2013 at 8:55 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 26 April 2013 18:13, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>> On 26.04.2013 19:50, Magnus Hagander wrote:
>>>
>>> On Fri, Apr 26, 2013 at 6:43 PM, Simon Riggs<simon@2ndquadrant.com>
>>> wrote:
>>>>
>>>> On 26 April 2013 17:25, Heikki Linnakangas<hlinnakangas@vmware.com>
>>>> wrote:
>>>>>
>>>>> Actually, from a usability point of view I think would be nice to have
>>>>> just
>>>>>
>>>>> one setting, "recovery_target". It's already somewhat confusing to have
>>>>> recovery_target_xid, recovery_target_time, and recovery_target_name,
>>>>> which
>>>>> are mutually exclusive, and recovery_target_inclusive which is just a
>>>>> modifier for the others. Maybe something like:
>>>>>
>>>>> recovery_target = 'xid 1234'
>>>>> recovery_target = 'xid 1234 exclusive'
>>>>> recovery_target = '2013-04-22 12:33'
>>>>> recovery_target = '2013-04-22 12:33 exclusive'
>>>>> recovery_target = 'consistent'
>>>>> recovery_target = 'name: daily backup'
>>>>
>>>>
>>>> So now you want to change the whole existing API so it fits with your
>>>> one new requirement?
>>
>>
>> No, I think the above would be a usability improvement whether or not we add
>> the new feature.
>
>
> I don't see the usability improvement. This is only being suggested to
> make one new addition look cleaner; there isn't a common gripe that
> the use of parameters is hard to use, other than their location and
> the ability to treat them as GUCs.

Actually, there is - I hear it quite often from people not so
experienced in PostgreSQL. Though in fairness, I'm not entirely sure
the new syntax would help - some of those need a tool to do it for
them, really (and such tools exist, I believe).

That said, there is one property that's very unclear now and that's
that you can only set one of recovery_target_time, recovery_target_xid
and recovery_target_name. But they can be freely combined with
recovery_target_timeline and recovery_target_inclusive. That's quite
confusing.
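
Spelled out as code, the rule looks like the Python sketch below. It is
only an illustration of the rule as described here: check_recovery_targets
is a hypothetical helper working on a plain dict of settings, not the
server's own validation.

```python
# The three targets described above as mutually exclusive.
EXCLUSIVE_TARGETS = (
    "recovery_target_time",
    "recovery_target_xid",
    "recovery_target_name",
)


def check_recovery_targets(settings: dict) -> list:
    """Return the targets that are set; raise if more than one is.

    recovery_target_timeline and recovery_target_inclusive are ignored
    here on purpose, because they may accompany any of the targets.
    """
    chosen = [key for key in EXCLUSIVE_TARGETS if settings.get(key)]
    if len(chosen) > 1:
        raise ValueError(
            "at most one recovery target may be set, got: " + ", ".join(chosen)
        )
    return chosen
```

A file that sets recovery_target_xid together with
recovery_target_inclusive passes; one that sets both recovery_target_time
and recovery_target_name is rejected.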



> This changes the existing API which will confuse people that know it
> and invalidate everything written in software and on wikis as to how
> to do it. That means all the "in case of fire break glass"
> instructions are all wrong and need to be rewritten and retested.

Yes, *that* is the main reason *not* to make the change. It has a
pretty bad cost in backwards compatibility loss. There is a gain, but
I don't think it outweighs the cost.



> It also introduces a single common datatype for such entries, where
> before we had that xids were numbers, names were text, so this new
> mechanism operates completely differently from all other GUC
> parameters.
>
> Plus its inconsistent, in that with xids you have 'xid 1234' whereas
> timestamps just say '2013-04-22' rather than 'timestamp 2013-04-22',
> or with names should they end in a colon or not. There'n no clear
> differentiation between text for names and other keywords. Presumably
> we'll need a complex parser to sort that out.

I'm assuming that was just typos in Heikki's example. I'm sure he
meant them to be consistent.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/



Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 2 May 2013 08:31, Magnus Hagander <magnus@hagander.net> wrote:

> That said, there is one property that's very unclear now and that's
> that you can only set one of recovery_target_time, recovery_target_xid
> and recovery_target_name. But they can be freely combined with
> recovery_target_timeline and recovery_target_inclusive. That's quite
> confusing.

In the docs we say "At most one of recovery_target_time,
recovery_target_name or recovery_target_xid can be specified." on each
of those parameter descriptions.

In recovery.conf.sample, we say
# You may set a recovery target either by transactionId, by name,
# or by timestamp. Recovery may either include or exclude the
# transaction(s) with the recovery target value (ie, stop either
# just after or just before the given target, respectively).

Who is confused by that? And if they are, why would they be less
confused with changed *syntax*? I think most people just copy the
examples anyway, they don't care about the syntax.

As we just saw, changing the syntax may introduce other consistency
issues and confusions that weren't there before. If the precise syntax
is the essence of a new and improved interface, surely it needs to be
fully worked out before anybody agrees.

It has always been the case that recovery is a complex topic and one
that is used in stressful circumstances. It isn't the syntax that
makes using this hard, it's the fact that the process itself is
non-trivial and not easy to use without some prior thought and testing
of how recovery will work for a particular company/enterprise.

I'm very progressive about both new features and usability
improvements, but rearranging things for minor reasons just feels like
a waste.

If we feel strongly about user interface design problems we should
treat them the same way we treat performance issues. Profile to
identify problem areas, analyze problems in those areas and suggest
solutions, then make tests to check that the new interface genuinely
works better than the old. That is proper UI improvement, not just
knee jerk reactions.

--
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Recovery target 'immediate'

From
Bruce Momjian
Date:
On Thu, May  2, 2013 at 09:04:20AM +0100, Simon Riggs wrote:
> If we feel strongly about user interface design problems we should
> treat them the same way we treat performance issues. Profile to
> identify problem areas, analyze problems in those areas and suggest
> solutions, then make tests to check that the new interface genuinely
> works better than the old. That is proper UI improvement, not just
> knee jerk reactions.

I am not sure if you are serious or not, but for me, email discussion is
sufficient.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: Recovery target 'immediate'

From
Bruce Momjian
Date:
On Thu, May  2, 2013 at 09:31:03AM +0200, Magnus Hagander wrote:
> Actually, there is - I hear it quite often from people not so
> experienced in PostgreSQL. Though in fairness, I'm not entirely sure
> the new syntax would help - some of those need a tool to do it for
> them, really (and such tools exist, I believe).
> 
> That said, there is one property that's very unclear now and that's
> that you can only set one of recovery_target_time, recovery_target_xid
> and recovery_target_name. But they can be freely combined with
> recovery_target_timeline and recovery_target_inclusive. That's quite
> confusing.
> 
> 
> 
> > This changes the existing API which will confuse people that know it
> > and invalidate everything written in software and on wikis as to how
> > to do it. That means all the "in case of fire break glass"
> > instructions are all wrong and need to be rewritten and retested.
> 
> Yes, *that* is the main reason *not* to make the change. It has a
> pretty bad cost in backwards compatibility loss. There is a gain, but
> I don't think it outweighs the cost.

So, is there a way to add this feature without breaking the API?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: Recovery target 'immediate'

From
Michael Paquier
Date:
On Fri, May 3, 2013 at 8:56 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Thu, May  2, 2013 at 09:31:03AM +0200, Magnus Hagander wrote:
> > Actually, there is - I hear it quite often from people not so
> > experienced in PostgreSQL. Though in fairness, I'm not entirely sure
> > the new syntax would help - some of those need a tool to do it for
> > them, really (and such tools exist, I believe).
> >
> > That said, there is one property that's very unclear now and that's
> > that you can only set one of recovery_target_time, recovery_target_xid
> > and recovery_target_name. But they can be freely combined with
> > recovery_target_timeline and recovery_target_inclusive. That's quite
> > confusing.
> >
> > > This changes the existing API which will confuse people that know it
> > > and invalidate everything written in software and on wikis as to how
> > > to do it. That means all the "in case of fire break glass"
> > > instructions are all wrong and need to be rewritten and retested.
> >
> > Yes, *that* is the main reason *not* to make the change. It has a
> > pretty bad cost in backwards compatibility loss. There is a gain, but
> > I don't think it outweighs the cost.
>
> So, is there a way to add this feature without breaking the API?

Yes, by adding a new parameter exclusively used to control this feature,
something like recovery_target_immediate = 'on/off'.
--
Michael

Re: Recovery target 'immediate'

From
Cédric Villemain
Date:
Le vendredi 3 mai 2013 02:54:15, Michael Paquier a écrit :
> On Fri, May 3, 2013 at 8:56 AM, Bruce Momjian <bruce@momjian.us> wrote:
> > On Thu, May  2, 2013 at 09:31:03AM +0200, Magnus Hagander wrote:
> > > Actually, there is - I hear it quite often from people not so
> > > experienced in PostgreSQL. Though in fairness, I'm not entirely sure
> > > the new syntax would help - some of those need a tool to do it for
> > > them, really (and such tools exist, I believe).
> > >
> > > That said, there is one property that's very unclear now and that's
> > > that you can only set one of recovery_target_time, recovery_target_xid
> > > and recovery_target_name. But they can be freely combined with
> > > recovery_target_timeline and recovery_target_inclusive. That's quite
> > > confusing.
> > >
> > > > This changes the existing API which will confuse people that know it
> > > > and invalidate everything written in software and on wikis as to how
> > > > to do it. That means all the "in case of fire break glass"
> > > > instructions are all wrong and need to be rewritten and retested.
> > >
> > > Yes, *that* is the main reason *not* to make the change. It has a
> > > pretty bad cost in backwards compatibility loss. There is a gain, but
> > > I don't think it outweighs the cost.
> >
> > So, is there a way to add this feature without breaking the API?
>
> Yes, by adding a new parameter exclusively used to control this feature,
> something like recovery_target_immediate = 'on/off'.

We just need to add a named restore point when ending the backup (in
pg_stop_backup() ?).
No API change required. Just document that some predefined target names are set
during backup.
--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

Re: Recovery target 'immediate'

From
Bruce Momjian
Date:
On Fri, May  3, 2013 at 01:02:08PM +0200, Cédric Villemain wrote:
> > > > > This changes the existing API which will confuse people that know it
> > > > > and invalidate everything written in software and on wikis as to how
> > > > > to do it. That means all the "in case of fire break glass"
> > > > > instructions are all wrong and need to be rewritten and retested.
> > > > 
> > > > Yes, *that* is the main reason *not* to make the change. It has a
> > > > pretty bad cost in backwards compatibility loss. There is a gain, but
> > > > I don't think it outweighs the cost.
> > > 
> > > So, is there a way to add this feature without breaking the API?
> > 
> > Yes, by adding a new parameter exclusively used to control this feature,
> > something like recovery_target_immediate = 'on/off'.
> 
> We just need to add a named restore point when ending the backup (in 
> pg_stop_backup() ?).
> No API change required. Just document that some predefined target names are set 
> during backup.

So we auto-add a restore point based on the backup label.  Does that
work for everyone?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: Recovery target 'immediate'

From
Heikki Linnakangas
Date:
On 03.05.2013 16:29, Bruce Momjian wrote:
> On Fri, May  3, 2013 at 01:02:08PM +0200, Cédric Villemain wrote:
>>>>>> This changes the existing API which will confuse people that know it
>>>>>> and invalidate everything written in software and on wikis as to how
>>>>>> to do it. That means all the "in case of fire break glass"
>>>>>> instructions are all wrong and need to be rewritten and retested.
>>>>>
>>>>> Yes, *that* is the main reason *not* to make the change. It has a
>>>>> pretty bad cost in backwards compatibility loss. There is a gain, but
>>>>> I don't think it outweighs the cost.
>>>>
>>>> So, is there a way to add this feature without breaking the API?
>>>
>>> Yes, by adding a new parameter exclusively used to control this feature,
>>> something like recovery_target_immediate = 'on/off'.
>>
>> We just need to add a named restore point when ending the backup (in
>> pg_stop_backup() ?).
>> No API change required. Just document that some predefined target names are set
>> during backup.
>
> So we auto-add a restore point based on the backup label.  Does that
> work for everyone?

Unfortunately, no. There are cases where you want to stop right after
reaching consistency, but the point where you reach consistency is not
at the end of a backup. For example, if you take a backup using an
atomic filesystem snapshot, there are no pg_start/stop_backup calls, and
the system will reach consistency after replaying all the WAL in
pg_xlog. You might think that you can just not create a recovery.conf
file in that case, or create a dummy recovery.conf file with
restore_command='/bin/false'. However, then the system will not find the
existing timeline history files in the archive, and can pick a TLI
that's already in use. I found this out the hard way, and actually ended
up writing a restore_command that restores timeline history files
normally, but returns non-zero for any other real files; it wasn't pretty.
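
A wrapper along those lines might look like the following Python sketch.
It is not the script mentioned above, only an illustration: the archive
path and the function name are made up, and it assumes the usual naming
of timeline history files (eight hex digits followed by ".history").

```python
#!/usr/bin/env python3
import re
import shutil
import sys
from pathlib import Path

# Hypothetical archive location; adjust to the real WAL archive.
ARCHIVE_DIR = Path("/mnt/server/archivedir")

# Timeline history files look like 00000002.history.
TIMELINE_HISTORY = re.compile(r"^[0-9A-F]{8}\.history$")


def restore(archive_dir: Path, file_name: str, destination: Path) -> int:
    """Copy a timeline history file out of the archive; refuse the rest."""
    if not TIMELINE_HISTORY.match(file_name):
        # WAL segments and backup history files: report them as missing,
        # so recovery only replays what is already in pg_xlog.
        return 1
    source = archive_dir / file_name
    if not source.is_file():
        return 1
    shutil.copyfile(source, destination)
    return 0


if __name__ == "__main__" and len(sys.argv) == 3:
    # Assumed usage: restore_command = '/path/to/this_script.py %f %p'
    sys.exit(restore(ARCHIVE_DIR, sys.argv[1], Path(sys.argv[2])))
```

The non-zero exit status for everything except history files is what
makes recovery stop at the end of the local WAL while still seeing which
timeline IDs are taken.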

Another case is that you take a backup from a standby server; you can't
write a restore-point WAL record in a standby.

If we want to avoid adding a new option for this, how about a magic
restore point called "consistent" or "immediate":

recovery_target_name='immediate'

That would stop recovery right after reaching consistency, but there
wouldn't be an actual restore point record in the WAL stream.

- Heikki



Re: Recovery target 'immediate'

From
Cédric Villemain
Date:
Le vendredi 3 mai 2013 15:40:51, Heikki Linnakangas a écrit :
> On 03.05.2013 16:29, Bruce Momjian wrote:
> > On Fri, May  3, 2013 at 01:02:08PM +0200, Cédric Villemain wrote:
> >>>>>> This changes the existing API which will confuse people that know it
> >>>>>> and invalidate everything written in software and on wikis as to how
> >>>>>> to do it. That means all the "in case of fire break glass"
> >>>>>> instructions are all wrong and need to be rewritten and retested.
> >>>>>
> >>>>> Yes, *that* is the main reason *not* to make the change. It has a
> >>>>> pretty bad cost in backwards compatibility loss. There is a gain, but
> >>>>> I don't think it outweighs the cost.
> >>>>
> >>>> So, is there a way to add this feature without breaking the API?
> >>>
> >>> Yes, by adding a new parameter exclusively used to control this
> >>> feature, something like recovery_target_immediate = 'on/off'.
> >>
> >> We just need to add a named restore point when ending the backup (in
> >> pg_stop_backup() ?).
> >> No API change required. Just document that some predefined target names
> >> are set during backup.
> >
> > So we auto-add a restore point based on the backup label.  Does that
> > work for everyone?
>
> Unfortunately, no. There are cases where you want to stop right after
> reaching consistency, but the point where you reach consistency is not
> at the end of a backup. For example, if you take a backup using an
> atomic filesystem snapshot, there are no pg_start/stop_backup calls, and
> the system will reach consistency after replaying all the WAL in
> pg_xlog. You might think that you can just not create a recovery.conf
> file in that case, or create a dummy recovery.conf file with
> restore_command='/bin/false'. However, then the system will not find the
> existing timeline history files in the archive, and can pick a TLI
> that's already in use. I found this out the hard way, and actually ended
> up writing a restore_command that restore timeline history files
> normally, but returns non-zero for any real other files; it wasn't pretty.

OK. I missed that you wanted that outside of the pg_start/stop_backup() dance.

> If we want to avoid adding a new option for this, how about a magic
> restore point called "consistent" or "immediate":
>
> recovery_target_name='immediate'
>
> That would stop recovery right after reaching consistency, but there
> wouldn't be an actual restore point record in the WAL stream.

Back to your first email then.
+1 (as pointed out by Simon, this is something we must document well: stopping at
'immediate' is sure to reduce your chance of recovering all the possible data
... opposite to recovery_target_name=ultimate, the default ;)  )

--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

Re: Recovery target 'immediate'

From
Robert Haas
Date:
On Fri, May 3, 2013 at 11:13 AM, Cédric Villemain
<cedric@2ndquadrant.com> wrote:
>> If we want to avoid adding a new option for this, how about a magic
>> restore point called "consistent" or "immediate":
>>
>> recovery_target_name='immediate'
>>
>> That would stop recovery right after reaching consistency, but there
>> wouldn't be an actual restore point record in the WAL stream.
>
> Back to your first email then.
> +1 (as pointed by Simon, this is something we must document well: stopping at
> 'immediate' is sure to reduce your chance of recovering all the possible data
> ... opposite to recovery_target_name=ultimate, the default ;)  )

Sounds good to me.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 3 May 2013 14:40, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
> On 03.05.2013 16:29, Bruce Momjian wrote:
>>
>> On Fri, May  3, 2013 at 01:02:08PM +0200, Cédric Villemain wrote:
>>>>>>>
>>>>>>> This changes the existing API which will confuse people that know it
>>>>>>> and invalidate everything written in software and on wikis as to how
>>>>>>> to do it. That means all the "in case of fire break glass"
>>>>>>> instructions are all wrong and need to be rewritten and retested.
>>>>>>
>>>>>>
>>>>>> Yes, *that* is the main reason *not* to make the change. It has a
>>>>>> pretty bad cost in backwards compatibility loss. There is a gain, but
>>>>>> I don't think it outweighs the cost.
>>>>>
>>>>>
>>>>> So, is there a way to add this feature without breaking the API?
>>>>
>>>>
>>>> Yes, by adding a new parameter exclusively used to control this feature,
>>>> something like recovery_target_immediate = 'on/off'.
>>>
>>>
>>> We just need to add a named restore point when ending the backup (in
>>> pg_stop_backup() ?).
>>> No API change required. Just document that some predefined target names
>>> are set
>>> during backup.
>>
>>
>> So we auto-add a restore point based on the backup label.  Does that
>> work for everyone?
>
>
> Unfortunately, no. There are cases where you want to stop right after
> reaching consistency, but the point where you reach consistency is not at
> the end of a backup. For example, if you take a backup using an atomic
> filesystem snapshot, there are no pg_start/stop_backup calls, and the system
> will reach consistency after replaying all the WAL in pg_xlog. You might
> think that you can just not create a recovery.conf file in that case, or
> create a dummy recovery.conf file with restore_command='/bin/false'.
> However, then the system will not find the existing timeline history files
> in the archive, and can pick a TLI that's already in use. I found this out
> the hard way, and actually ended up writing a restore_command that restores
> timeline history files normally, but returns non-zero for any other real
> files; it wasn't pretty.
>
> Another case is that you take a backup from a standby server; you can't
> write a restore-point WAL record in a standby.
>
> If we want to avoid adding a new option for this, how about a magic restore
> point called "consistent" or "immediate":
>
> recovery_target_name='immediate'
>
> That would stop recovery right after reaching consistency, but there
> wouldn't be an actual restore point record in the WAL stream.
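The restore_command workaround described in the quoted message (restore timeline history files normally, return non-zero for everything else) could be sketched roughly as follows. This is only an illustration, not the script Heikki actually wrote: the function names, the archive layout (a plain directory of files), and the script name in the comment are all assumptions.

```python
import re
import shutil
from pathlib import Path

# Timeline history files are named with eight hex digits plus ".history",
# e.g. "00000002.history". WAL segments and backup history files do not match.
HISTORY_FILE = re.compile(r"[0-9A-Fa-f]{8}\.history")


def is_timeline_history(filename: str) -> bool:
    """Return True only for timeline history file names."""
    return HISTORY_FILE.fullmatch(filename) is not None


def restore(archive_dir: str, filename: str, dest_path: str) -> int:
    """Copy a timeline history file out of the archive.

    Returns 0 on success and 1 otherwise, mirroring the exit status
    a restore_command is expected to produce.
    """
    if not is_timeline_history(filename):
        # Pretend WAL segments and other files are not in the archive,
        # so recovery only replays the WAL already in pg_xlog.
        return 1
    src = Path(archive_dir) / filename
    if not src.is_file():
        return 1
    shutil.copy(src, dest_path)
    return 0


# Hypothetical wiring, assuming the file is saved as history_only_restore.py
# and ends with sys.exit(restore(*sys.argv[1:4])):
#   restore_command = 'python history_only_restore.py /path/to/archive %f %p'
```

With something like this in place the server can still see which timeline IDs are already in use, while every request for a WAL segment fails as if the archive were empty.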

recovery_target_name='something'

...works for me. Either 'consistent' or 'immediate' works.

I request that the docs recommend this be used in conjunction with
pause_at_recovery_target = on, so that the user can begin inspecting
the database at the first available point and then roll forward from
that point if desired. That would cover my concern that this stopping
point is arbitrary and not worth stopping at in and of itself.

Can I suggest that we discuss a range of related changes together? So
we have a roadmap of agreed changes in this area. That will be more
efficient than discussing each one individually; often each one makes
sense only as part of the wider context.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Recovery target 'immediate'

From
Heikki Linnakangas
Date:
On 07.05.2013 15:38, Simon Riggs wrote:
> On 3 May 2013 14:40, Heikki Linnakangas<hlinnakangas@vmware.com>  wrote:
>> If we want to avoid adding a new option for this, how about a magic restore
>> point called "consistent" or "immediate":
>>
>> recovery_target_name='immediate'
>>
>> That would stop recovery right after reaching consistency, but there
>> wouldn't be an actual restore point record in the WAL stream.
>
> recovery_target_name='something'
>
> ...works for me. Either 'consistent' or 'immediate' works.
>
> I request that the docs recommend this be used in conjunction with
> pause_at_recovery_target = on, so that the user can begin inspecting
> the database at the first available point and then roll forward from
> that point if desired. That would cover my concern that this stopping
> point is arbitrary and not worth stopping at in and of itself.

Sounds good. I've added this to the TODO.

> Can I suggest that we discuss a range of related changes together? So
> we have a roadmap of agreed changes in this area. That will be more
> efficient than discussing each one individually; often each one makes
> sense only as part of the wider context.

Sure, do you have something else in mind related to this?

- Heikki



Re: Recovery target 'immediate'

From
Fujii Masao
Date:
On Tue, May 7, 2013 at 9:38 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 3 May 2013 14:40, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:
>> On 03.05.2013 16:29, Bruce Momjian wrote:
>>>
>>> On Fri, May  3, 2013 at 01:02:08PM +0200, Cédric Villemain wrote:
>>>>>>>>
>>>>>>>> This changes the existing API which will confuse people that know it
>>>>>>>> and invalidate everything written in software and on wikis as to how
>>>>>>>> to do it. That means all the "in case of fire break glass"
>>>>>>>> instructions are all wrong and need to be rewritten and retested.
>>>>>>>
>>>>>>>
>>>>>>> Yes, *that* is the main reason *not* to make the change. It has a
>>>>>>> pretty bad cost in backwards compatibility loss. There is a gain, but
>>>>>>> I don't think it outweighs the cost.
>>>>>>
>>>>>>
>>>>>> So, is there a way to add this feature without breaking the API?
>>>>>
>>>>>
>>>>> Yes, by adding a new parameter exclusively used to control this feature,
>>>>> something like recovery_target_immediate = 'on/off'.
>>>>
>>>>
>>>> We just need to add a named restore point when ending the backup (in
>>>> pg_stop_backup() ?).
>>>> No API change required. Just document that some predefined target names
>>>> are set
>>>> during backup.
>>>
>>>
>>> So we auto-add a restore point based on the backup label.  Does that
>>> work for everyone?
>>
>>
>> Unfortunately, no. There are cases where you want to stop right after
>> reaching consistency, but the point where you reach consistency is not at
>> the end of a backup. For example, if you take a backup using an atomic
>> filesystem snapshot, there are no pg_start/stop_backup calls, and the system
>> will reach consistency after replaying all the WAL in pg_xlog. You might
>> think that you can just not create a recovery.conf file in that case, or
>> create a dummy recovery.conf file with restore_command='/bin/false'.
>> However, then the system will not find the existing timeline history files
>> in the archive, and can pick a TLI that's already in use. I found this out
>> the hard way, and actually ended up writing a restore_command that restores
>> timeline history files normally, but returns non-zero for any other real
>> files; it wasn't pretty.
>>
>> Another case is that you take a backup from a standby server; you can't
>> write a restore-point WAL record in a standby.
>>
>> If we want to avoid adding a new option for this, how about a magic restore
>> point called "consistent" or "immediate":
>>
>> recovery_target_name='immediate'
>>
>> That would stop recovery right after reaching consistency, but there
>> wouldn't be an actual restore point record in the WAL stream.
>
> recovery_target_name='something'
>
> ...works for me. Either 'consistent' or 'immediate' works.
>
> I request that the docs recommend this be used in conjunction with
> pause_at_recovery_target = on, so that the user can begin inspecting
> the database at the first available point and then roll forward from
> that point if desired.

Also, should we forbid users from setting recovery_target_inclusive to false
when recovery_target_name is set to something like 'immediate'? In that
case, recovery would always end before reaching a consistent state and
fail.
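As a rough illustration of the check being asked about, here is a minimal sketch that rejects the combination up front. It is not PostgreSQL code: the function names, the dictionary representation of recovery.conf, and the set of magic target names are assumptions made for this example only.

```python
from typing import Mapping

# Magic target names floated in this thread; neither is an existing setting.
MAGIC_TARGETS = {"immediate", "consistent"}


def parse_bool(value: str) -> bool:
    """Parse an on/off style boolean setting."""
    v = value.strip().lower()
    if v in ("on", "true", "yes", "1"):
        return True
    if v in ("off", "false", "no", "0"):
        return False
    raise ValueError("not a boolean value: %r" % value)


def check_recovery_settings(settings: Mapping[str, str]) -> None:
    """Raise ValueError if a magic target is combined with inclusive = false."""
    name = settings.get("recovery_target_name")
    # recovery_target_inclusive defaults to true when not set.
    inclusive = parse_bool(settings.get("recovery_target_inclusive", "true"))
    if name in MAGIC_TARGETS and not inclusive:
        raise ValueError(
            "recovery_target_inclusive = false cannot be combined with "
            "recovery_target_name = '%s'" % name
        )
```

Rejecting the setting when the configuration is read would surface the problem before any WAL is replayed, rather than partway through recovery.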

Regards,

--
Fujii Masao



Re: Recovery target 'immediate'

From
Simon Riggs
Date:
On 7 May 2013 13:50, Heikki Linnakangas <hlinnakangas@vmware.com> wrote:

>> Can I suggest that we discuss a range of related changes together? So
>> we have a roadmap of agreed changes in this area. That will be more
>> efficient than discussing each one individually; often each one makes
>> sense only as part of the wider context.
>
>
> Sure, do you have something else in mind related to this?

Not right this second. But I feel it would be better to consider
things in a more top-down "what do we need in this area?" approach
than the almost random mechanisms we use now. Given each of us seems
to be equally surprised by what others are thinking, it would make
sense to have a broader topic-level discussion, make a list of the
thoughts and priorities in each area and discuss things as a whole.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services