Jan, et al.,
On Jan 26, 2007, at 2:37 AM, Naz Gassiep wrote:
> I would be *very* concerned that system time is not a guaranteed
> monotonic entity. Surely a counter or other internally managed
> mechanism would be a better solution.
As you should be concerned. Looking on my desk through the last few
issues in IEEE Transactions on Parallel and Distributed Systems, I
see no time synch stuff for clusters of machines that is actually
based on time. Almost all rely on something like a Lamport timestamp
or some relaxation thereof. A few are based on a tree-based pulse.
Using actual times is fraught with problems and is typically
inappropriate for cluster synchronization needs.
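To illustrate what I mean by a Lamport timestamp, here is a minimal sketch in Python. It is not taken from any particular replication system; the class name LamportClock and its method names are hypothetical, chosen only for this example. Each node keeps a plain integer counter instead of reading the system clock, so ordering never depends on wall time.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward (hypothetical sketch)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the returned value travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Jump past both our own counter and the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two nodes whose system clocks may disagree arbitrarily.
master = LamportClock()
slave = LamportClock()

master.tick()                   # local write on the master -> 1
stamp = master.send()           # replication message carries timestamp 2
applied = slave.receive(stamp)  # slave advances to 3
assert applied > stamp          # the receive is ordered after the send
```

The point is that the receive is always stamped later than the send, no matter how far apart the two machines' wall clocks are. Ties between unrelated events on different nodes still need a tie-breaker (a node id is the usual choice), which this sketch leaves out.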
> Furthermore, what would be the ramifications of master and slave
> system times being out of sync?
I'm much more concerned with the overall approach. The algorithm for
replication should be published in theoretic style with a thorough
analysis of its assumptions and a proof of correctness based on those
assumptions. Databases and replication therein are definitely not
"off-the-cuff" technologies; they require rigorous academic
discussion and acceptance before they will get adopted. People
generally will not adopt technologies to store mission critical data
until they are confident that it will both work as designed and work
as implemented -- the second is far less important, as the weaknesses
there are simply bugs.
I'm not implying that this rigorous dissection of replication design
hasn't happened, but I didn't see it referenced anywhere in this
thread. Can you point me to it? I've reviewed many of these papers
and would like to better understand what you are aiming at.
Best regards,
Theo Schlossnagle