Thread: Transaction system (proposal for 6.5)
Hi...

I was thinking of a major rewrite of the PostgreSQL transaction system, in order to provide less tuple overhead and recoverability.

My first goal is to reduce tuple overhead by getting rid of xmin/xmax and cmin/cmax. To provide this functionality, I'm planning to keep only a flag indicating whether a transaction is in progress on the tuple. If a tuple is affected during a transaction, this flag will store the current transaction id. When the change to the tuple is committed, an invalid OID (say, 0) will be written to this flag.

The only problem I saw with this approach is if some pages get flushed during the transaction, because these pages will have to be reloaded from disk.

To address the problem of non-functional updates, I intend to store a command identifier with the tuple and, during an update, check whether the cid of a tuple is equal to the current cid of this transaction (like we do today).

To keep track of current transactions, there will be a list of the tuples affected by each transaction, together with the operation executed. This way, during commit, we only confirm these operations in the relations (writing an invalid OID in the current xid of each tuple affected). To roll back, we delete the new tuples (and mark this operation as a commit) and mark the old tuples affected as "live" (and leave these committed).

I'm thinking of assigning a transaction id to each new backend, and the postmaster will keep track of used transaction ids. This way, there is no need to keep a list of transactions in shared memory.

For recovery (my second goal), I intend, at postmaster startup, to roll back all transactions marked as in progress. After that, I'm thinking about a redo log, but I'm still searching for a way to keep it as small as possible.

Suggestions? Comments?

Robson.
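For readers following the thread, the proposal above can be sketched roughly as below. This is only an illustration in Python, not PostgreSQL code: the names `HeapTuple`, `Transaction`, `touched` and the `live` field are hypothetical, and the relation is modeled as a plain in-memory list, so the page-flush problem mentioned in the proposal is not addressed.

```python
from dataclasses import dataclass, field

INVALID_XID = 0  # the proposal's "invalid OID": no transaction in progress


@dataclass(eq=False)  # compare tuples by identity, not by content
class HeapTuple:
    data: str
    xid: int = INVALID_XID  # in-progress flag: 0, or the xid that touched the tuple
    live: bool = True       # False once the tuple has been superseded or deleted


@dataclass
class Transaction:
    xid: int
    # (operation, tuple) pairs in execution order; this layout is an assumption
    touched: list = field(default_factory=list)

    def insert(self, relation, data):
        tup = HeapTuple(data, xid=self.xid)
        relation.append(tup)
        self.touched.append(("insert", tup))
        return tup

    def delete(self, tup):
        tup.xid = self.xid
        tup.live = False
        self.touched.append(("delete", tup))

    def update(self, relation, old, data):
        # An update keeps the old tuple and adds a new one next to it.
        self.delete(old)
        return self.insert(relation, data)

    def commit(self):
        # Confirm every recorded operation by clearing the in-progress flag.
        for _, tup in self.touched:
            tup.xid = INVALID_XID
        self.touched.clear()

    def rollback(self, relation):
        # Undo in reverse order: drop new tuples, mark old ones live again.
        for op, tup in reversed(self.touched):
            if op == "insert":
                relation.remove(tup)
            else:
                tup.live = True
            tup.xid = INVALID_XID
        self.touched.clear()


relation = []
setup = Transaction(xid=1)
old = setup.insert(relation, "balance=100")
setup.commit()

tx = Transaction(xid=2)
new = tx.update(relation, old, "balance=50")
in_progress = [t.data for t in relation if t.xid == 2]
tx.rollback(relation)
```

After the rollback only the original tuple remains, live and with its flag cleared. Note that the sketch relies on the per-transaction list surviving until commit or rollback; recovering after a crash would need some other way to tell new tuples from old ones.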
On Wed, Sep 16, 1998 at 06:35:53PM -0300, Robson Miranda wrote:
> I was thinking of a major rewrite of the PostgreSQL transaction
> system, in order to provide less tuple overhead and recoverability.

I do not have much of an idea how postgres handles stuff right now, so forgive me if I'm asking stupid questions.

> My first goal is to reduce tuple overhead, getting rid of xmin/xmax and
> cmin/cmax. To provide this functionality, I'm planning to keep only a
> flag indicating if the transaction is in progress or not. If, during a
> transaction, a certain tuple is affected, this flag will store the
> current transaction id. Thus, if this tuple is committed, an invalid OID
> (say, 0), will be written to this flag.

That means you store one flag per tuple? Does this happen only in memory?

> The only problem I saw using this approach is if some pages got flushed
> during the transaction, because these pages will have to be reloaded from
> disk.

Ah yes, it seems to be in memory only. And you exactly point to one problem. Any idea how to solve this?

> To keep track of current transactions, there will be a list of tuples
> affected by this transaction, and the operation executed. This way,
> during commit, we only confirm these operations in relations (writing an
> invalid OID in current xid of each tuple affected). To rollback, we
> delete the new tuples (and mark this operation as a commit) and mark the
> old tuples affected as "live" (and leave these committed).

That means we always have both in the relation? That is, we write the new tuple in and keep the old one? Is this done the same way in the current version? I'd prefer to have a clean cut, with new and old not being in the same table at the same time.

> For recovery (my second goal), I intend, at startup of postmaster,
> to rollback all marked in-progress transactions. After that, I'm thinking
> about a redo log, but I'm still searching a way to keep it with the
> minimum size possible.

Where's the problem with a redo log?

Michael
--
Dr. Michael Meskes      | Th.-Heuss-Str. 61, D-41812 Erkelenz  | Go SF49ers!
Senior-Consultant       | business: Michael.Meskes@mummert.de  | Go Rhein Fire!
Mummert+Partner         | private: Michael.Meskes@usa.net      | Use Debian
Unternehmensberatung AG |          Michael.Meskes@gmx.net      | GNU/Linux!
> Hi...
>
>
> I was thinking of a major rewrite of the PostgreSQL transaction
> system, in order to provide less tuple overhead and recoverability.
>
> My first goal is to reduce tuple overhead, getting rid of xmin/xmax and
> cmin/cmax. To provide this functionality, I'm planning to keep only a
> flag indicating if the transaction is in progress or not. If, during a
> transaction, a certain tuple is affected, this flag will store the
> current transaction id. Thus, if this tuple is committed, an invalid OID
> (say, 0), will be written to this flag.
>
> The only problem I saw using this approach is if some pages got flushed
> during the transaction, because these pages will have to be reloaded from
> disk.
>
> To address the problem of non-functional update, I intend to store a
> command identifier with the tuple, and, during update, see if the cid of
> a tuple is equal to the current cid of this transaction (like we do
> today).
>
> To keep track of current transactions, there will be a list of tuples
> affected by this transaction, and the operation executed. This way,
> during commit, we only confirm these operations in relations (writing an
> invalid OID in current xid of each tuple affected). To rollback, we
> delete the new tuples (and mark this operation as a commit) and mark the
> old tuples affected as "live" (and leave these committed).
>
> I'm thinking of assigning a transaction id to each new backend, and
> postmaster will keep track of used transaction ids. This way, there is
> no need to keep a list of transactions in shared memory.
>
> For recovery (my second goal), I intend, at startup of postmaster,
> to rollback all marked in-progress transactions. After that, I'm thinking
> about a redo log, but I'm still searching a way to keep it with the
> minimum size possible.

Interesting. I know we have talked in the past about the various system columns and their removal. If you check the hackers archive under cmin, etc., I think you will find some discussion.

Now, as far as their removal goes, is it worth removing 8 bytes of tuple overhead for the gain of having to do a redo log, etc.? I am not sure. I know many commercial databases have it, but I am not sure how beneficial it would be.

What I would really like is the ability to re-use superseded tuples without vacuum. It seems that should be possible, but it has not been done by anyone yet. That would be a HUGE win, I think.

--
  Bruce Momjian                          |  830 Blythe Avenue
  maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  http://www.op.net/~candle              |  (610) 353-9879(w)
  +  If your life is a hard drive,       |  (610) 853-3000(h)
  +  Christ can be your backup.          |
On Sun, 20 Sep 1998, Bruce Momjian wrote:

> > Hi...
> >
> >
> > I was thinking of a major rewrite of the PostgreSQL transaction
> > system, in order to provide less tuple overhead and recoverability.
> >
> > My first goal is to reduce tuple overhead, getting rid of xmin/xmax and
> > cmin/cmax. To provide this functionality, I'm planning to keep only a
> > flag indicating if the transaction is in progress or not. If, during a
> > transaction, a certain tuple is affected, this flag will store the
> > current transaction id. Thus, if this tuple is committed, an invalid OID
> > (say, 0), will be written to this flag.
> >
> > The only problem I saw using this approach is if some pages got flushed
> > during the transaction, because these pages will have to be reloaded from
> > disk.
> >
> > To address the problem of non-functional update, I intend to store a
> > command identifier with the tuple, and, during update, see if the cid of
> > a tuple is equal to the current cid of this transaction (like we do
> > today).
> >
> > To keep track of current transactions, there will be a list of tuples
> > affected by this transaction, and the operation executed. This way,
> > during commit, we only confirm these operations in relations (writing an
> > invalid OID in current xid of each tuple affected). To rollback, we
> > delete the new tuples (and mark this operation as a commit) and mark the
> > old tuples affected as "live" (and leave these committed).
> >
> > I'm thinking of assigning a transaction id to each new backend, and
> > postmaster will keep track of used transaction ids. This way, there is
> > no need to keep a list of transactions in shared memory.
> >
> > For recovery (my second goal), I intend, at startup of postmaster,
> > to rollback all marked in-progress transactions. After that, I'm thinking
> > about a redo log, but I'm still searching a way to keep it with the
> > minimum size possible.
> >
> Interesting. I know we have talked in the past about the various system
> columns and their removal. If you check the hackers archive under cmin,
> etc., I think you will find some discussion.
>
> Now, as far as their removal goes, is it worth removing 8 bytes of tuple
> overhead for the gain of having to do a redo log, etc.? I am not sure.
> I know many commercial databases have it, but I am not sure how
> beneficial it would be.

I may be missing something in the original posting that you are seeing, but I don't see the two as necessarily being inter-related... my understanding of the Oracle redo logs is that if a database gets corrupted, you can rebuild it from the last backup + the redo logs to get to the same point as where the corruption happened...

> What I would really like is the ability to re-use superseded tuples
> without vacuum. It seems that should be possible, but it has not been
> done by anyone yet. That would be a HUGE win, I think.

Not sure, but IMHO, having a redo log capability would be a HUGE win also... consider a mission critical application that doesn't have, in essence, "live backups" in the form of a redo log...

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
The Hermit Hacker <scrappy@hub.org> writes:
> Not sure, but IMHO, having a redo log capability would be a HUGE
> win also...consider a mission critical application that doesn't have, in
> essence, "live backups" in the form of a redo log...

Considering that, every PostgreSQL application I write has two backups:

  o a full database dump,
  o an incremental change log,

so I can do exactly that...

--Michael
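The "full dump plus incremental change log" approach amounts to replaying the logged changes on top of the last dump. A minimal sketch follows, assuming a key/value view of the data and a hypothetical log format of `(operation, key, value)` entries; neither is taken from the message above, and a real application-level change log would record SQL statements or rows instead.

```python
def restore(full_dump, change_log):
    """Rebuild the current state from a dump plus the changes logged after it."""
    db = dict(full_dump)  # start from a copy of the last full dump
    for op, key, value in change_log:  # replay in the order the changes happened
        if op == "put":
            db[key] = value
        elif op == "delete":
            db.pop(key, None)
        else:
            raise ValueError(f"unknown log operation: {op!r}")
    return db


dump = {"acct:1": 100, "acct:2": 250}
log = [
    ("put", "acct:1", 80),
    ("put", "acct:3", 20),
    ("delete", "acct:2", None),
]
restored = restore(dump, log)
```

Replaying up to the last log entry brings the data to the point just before the failure, which is the property a redo log is meant to provide inside the database itself.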
Robson Miranda wrote:
>
> I was thinking of a major rewrite of the PostgreSQL transaction
> system, in order to provide less tuple overhead and recoverability.
>
> My first goal is to reduce tuple overhead, getting rid of xmin/xmax and
> cmin/cmax. To provide this functionality, I'm planning to keep only a

I need xmin & xmax for multi-version concurrency control... Let's decide what should be implemented in 6.5...

> To address the problem of non-functional update, I intend to store a
> command identifier with the tuple, and, during update, see if the cid of
> a tuple is equal to the current cid of this transaction (like we do
> today).

cmin & cmax greatly simplify the implementation of the visibility rules for data changes. I'm not sure it is even possible to do this with only one attribute for the command id, keeping in mind triggers and (PL/)SQL functions...

Vadim
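To make the role of the two command ids concrete, here is a much simplified sketch of how a command could decide whether it sees a tuple. It is not the PostgreSQL implementation: `TupleHeader` and `visible` are hypothetical names, and every transaction other than the current one is assumed to have committed, so snapshots and aborted transactions are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TupleHeader:
    xmin: int                   # transaction that inserted the tuple
    cmin: int                   # command within xmin that inserted it
    xmax: Optional[int] = None  # transaction that deleted it, if any
    cmax: Optional[int] = None  # command within xmax that deleted it


def visible(t, my_xid, cur_cid):
    """Simplified visibility of tuple t for command cur_cid of transaction my_xid."""
    if t.xmin == my_xid and t.cmin >= cur_cid:
        return False  # inserted by this command or a later one
    if t.xmax is None:
        return True   # never deleted
    if t.xmax == my_xid:
        return t.cmax >= cur_cid  # still visible unless an earlier command deleted it
    return False      # deleted by another (assumed committed) transaction


# Transaction 7 runs an UPDATE as its command 2: the old version gets
# xmax/cmax set, and a new version is inserted with xmin/cmin.
old = TupleHeader(xmin=3, cmin=0, xmax=7, cmax=2)
new = TupleHeader(xmin=7, cmin=2)

during_update = (visible(old, 7, 2), visible(new, 7, 2))
next_command = (visible(old, 7, 3), visible(new, 7, 3))
```

While command 2 is running it sees only the old version, so it cannot update its own output again; command 3 sees only the new version. The check needs both the inserting and the deleting command id, which illustrates the concern about keeping a single command-id attribute.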