Hello All,
Does anyone really know how this multi version concurrency really works?
The basic idea is simple, never update anything so that you can always get
read-consistent queries by reading behind the updates. But the details are
missing in the docs. In particular:-
1. How long do previous version hang arround? As long as the oldest
transaction, or until a Vacuum?
2. How expensive are they? Do the multi-version copies clutter the main
database blocks, or are they stored elsewhere? And if I update one field of
a row, is the entire row copied or just the field?
3. What does Vacuum really do? Will it interfere with open transactions, or
will it know to leave multi version copies for existing transactions?
PS. I have done a few experiements on how the locking really works, see
http://www.SimpleORM.org/DBNotes.html. Note the difference with Oracle,
which I think is better and easy to implement given the basic MV approach.
PPS. Design issues:-
1. Why not just use the transaction log to implement MV? Each record only
needs to store a pointer into the logs to the previous update. And the logs
would log that pointer, forming a backward pointing linked list. Most of
the time this will not be needed, or will refer to a very recently written
tail of the log which will already be cached in memory. This approach
would put an end to Oracle's issus with running out of "Rollback Segments".
It would also put an end to Postgres performance problems on inserts and
updates -- why be slower than MySQL? You need a transaction log anyway, so
no extra overhead should be added for MV unless a query actually needs to
read behind updated data.
2. If you are going to go to the trouble of keeping multi versions for
locking, why not make them available for the application as well? One of
the painful features to implement is audit trails of how records get to be
the way they are. The database could do that automatically.