Simon Riggs <simon@2ndquadrant.com> writes:
> Should we use a different datatype than time_t for the commit timestamp,
> one that offers more fine grained differentiation between checkpoints?
Pretty much everybody supports gettimeofday() (time_t and separate
integer microseconds); you might as well use that. Note that the actual
resolution is not necessarily microseconds, and it'd still not be
certain that successive commits have distinct timestamps --- so maybe
this refinement would be pointless. You'll still have to design a user
interface that allows selection without the assumption of distinct
timestamps.
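
To make the two points above concrete, here is a minimal sketch, not a proposed patch. The helper names commit_timestamp_usec, count_duplicate_timestamps and stop_before_commit, and the "inclusive" flag, are all made up for illustration. The first two show how to fold gettimeofday()'s result into one 64-bit microsecond value and how to check whether back-to-back readings repeat. The third shows one possible stop rule that stays well defined when several commits carry the same timestamp.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/*
 * Hypothetical helper: wall-clock time as microseconds since the epoch,
 * built from the time_t seconds and integer microseconds that
 * gettimeofday() returns.
 */
static int64_t
commit_timestamp_usec(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (int64_t) tv.tv_sec * 1000000 + (int64_t) tv.tv_usec;
}

/*
 * Take n back-to-back readings and count how many are equal to the
 * reading just before them.  A nonzero result means the clock did not
 * separate successive events on this machine.
 */
static int
count_duplicate_timestamps(int n)
{
    int     dups = 0;
    int64_t prev = commit_timestamp_usec();

    for (int i = 1; i < n; i++)
    {
        int64_t cur = commit_timestamp_usec();

        if (cur == prev)
            dups++;
        prev = cur;
    }
    return dups;
}

/*
 * Hypothetical stop rule: return true if recovery should stop BEFORE
 * applying a commit stamped commit_ts.  With inclusive = true, every
 * commit whose stamp equals target_ts is applied; with inclusive = false,
 * none of them is.  Either way the outcome does not depend on the
 * timestamps being distinct.
 */
static bool
stop_before_commit(int64_t commit_ts, int64_t target_ts, bool inclusive)
{
    return inclusive ? (commit_ts > target_ts) : (commit_ts >= target_ts);
}
```

Whether count_duplicate_timestamps() ever reports a nonzero value depends on the platform's clock, so treat it as a probe rather than a guarantee in either direction.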
> - when we stop, keep reading records until EOF, just don't apply them.
> When we write a checkpoint at end of recovery, the unapplied
> transactions are buried alive, never to return.
> - stop where we stop, then force zeros to EOF, so that no possible
> record remains of previous transactions.
Go with plan B; it's best not to destroy data (what if you chose the
wrong restart point the first time?).
Actually this now reminds me of a discussion I had with Patrick
Macdonald some time ago. The DB2 practice in this connection is that
you *never* overwrite existing logfile data when recovering. Instead
you start a brand new xlog segment file, which is given a new "branch
number" so it can be distinguished from the future-time xlog segments
that you chose not to apply. I don't recall what the DB2 terminology
was exactly --- not "branch number" I don't think --- but anyway the
idea is that when you restart the database after an incomplete recovery,
you are now in a sort of parallel universe that has its own history
after the branch point (PITR stop point). You need to be able to
distinguish archived log segments of this parallel universe from those
of previous and subsequent incarnations. I'm not sure whether Vadim
intended our StartUpID to serve this purpose, but it could perhaps be
used that way, if we reflected it in the WAL file names.
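
As a rough illustration of "reflecting it in the WAL file names", the sketch below prefixes a branch identifier to the log and segment numbers. The function branch_segment_name and the 8-hex-digits-per-field layout are assumptions made up for this example; nothing here describes an existing naming scheme.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical naming scheme: branch id, log id and segment number,
 * each printed as 8 upper-case hex digits.  Two segments at the same
 * log/segment position but on different branches get different names,
 * so an archive can hold both without one overwriting the other.
 *
 * Returns what snprintf() returns: the length the full name needs,
 * excluding the terminating NUL (24 for this layout).
 */
static int
branch_segment_name(char *buf, size_t buflen,
                    uint32_t branch, uint32_t log, uint32_t seg)
{
    return snprintf(buf, buflen, "%08X%08X%08X",
                    (unsigned int) branch,
                    (unsigned int) log,
                    (unsigned int) seg);
}
```

With this layout, branch 1 and branch 2 at log 0, segment 5 would be named 000000010000000000000005 and 000000020000000000000005 respectively. Because the branch comes first, names also sort by branch before position.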
regards, tom lane