On Mon, Oct 24, 2016 at 1:26 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Fri, Oct 21, 2016 at 10:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> 2. In perform_base_backup(), if the endptr returned by
>> do_pg_stop_backup() precedes the end of the checkpoint record returned
>> by do_pg_start_backup(), use the latter value instead. Unfortunately,
>> that's not so easy: we can't just say if (endptr < starptr) startptr =
>> endptr; because startptr is the *start* of the checkpoint record, not
>> the end. I suspect somebody could figure out a solution to this
>> problem, though.
>>
>
> With this approach, don't we need something similar for API's
> pg_stop_backup()/pg_stop_backup_v2()?
Yes, I think so. That would sort of map with the idea I mentioned
upthread to have pg_stop_backup() return the contents of the control
file and have the caller write it to the backup by itself.
>> If we decide we don't want to aim for one of these tighter solutions
>> and just adopt the previously-discussed patch, then at the very least
>> it needs better comments.
>
> +1.
Yeah, here is an attempt at doing that:
- * We return the current minimum recovery point as the backup end
+ * We return the current last replayed point as the backup end
* location. Note that it can be greater than the exact backup end
- * location if the minimum recovery point is updated after the backup of
+ * location if the last replayed point is updated after the backup of
* pg_control. This is harmless for current uses.
*
+ * Using the last replayed point as the backup end location ensures that
+ * the end location will never be older than the start position, something
+ * that could happen if for example minRecoveryPoint is used as backup
+ * end location when it never gets updated because no buffer flushes
+ * occurred. By using the last replay location, note that the backup may
+ * include more WAL segments than necessary. If the additional WAL
+ * replayed since minRecoveryPoint does not include a checkpoint, there
+ * is actually no need for it. Even if it does include a checkpoint,
+ * only the portion up to the checkpoint itself is necessary and not
+ * the WAL generated beyond that. Still, in the case of a backup taken
+ * from a standby, with its master disconnected, this ensures that the
+ * backup is valid.
+ *
Thoughts welcome.
--
Michael