> Adding the stop time column should be a simple addition and I don't see > a problem with that. I think I misunderstood your original request on > that. Because you are just talking about returning a timestamptz with > the "right now" value for when you called pg_stop_backup()? Or to be > specific, just before pg_Stop_backup *finished*. Or do you mean when > pg_stop_backup() started?
What would be ideal is the minimum time that could be used for PITR. In an exclusive backup that's the time the end-of-backup record is written to WAL. In a non-exlusive backup I'm not quite sure how that works.
I guess I was hoping that you would know. I fine with just getting the current timestamp as is currently done in do_pg_stop_backup(). It's not perfect but it will be pretty close.
I thought some more about putting STOP_WAL_LOCATION into the backup label and I think this is an important step. Without that the recovery process needs to use pg_control to determine when the database has reach consistency and that will only work if pg_control was copied last.
In summary, I think the patch should be updated to include stop_time as a column and add STOP_WAL_LOCATION and STOP_TIME to backup_label. The recovery process should read STOP_WAL_LOCATION from backup_label and use that to decide when the database has become consistent.
Is that the only reason we need pg_control to be copied last? I thought there were other reasons for that.
I was chatting with Stephen about this earlier, and one thing we considered was that maybe we should return pg_control as a bytea field from pg_stop_backup() thereby strongly hinting to tools that they should write it out from there rather than copy it from the data directory (we can't stop them from copying it from the data directory like we do for backup_label of course, but if we return the contents and document that's how it should be done, it's pretty clear).
But if we can actually remove the whole requirement on the order of the copy of pg_control by doing that it certainly seems worthwhile.
So - I can definitely see the argument for returning the stop wal *location*. But I'm still not sure what the definition of the time would be? We can't return it before we know what it means...