Thread: Transaction system (proposal for 6.5)

Transaction system (proposal for 6.5)

From

Robson Miranda

Date:

16 September 1998, 17:36:50

Hi...


    I have been thinking about a major rewrite of the PostgreSQL transaction
system, in order to reduce tuple overhead and provide recoverability.

    My first goal is to reduce tuple overhead by getting rid of xmin/xmax and
cmin/cmax. To provide this functionality, I'm planning to keep only a
flag indicating whether the transaction is in progress or not. If, during a
transaction, a certain tuple is affected, this flag will store the
current transaction id. Then, when this tuple is committed, an invalid OID
(say, 0) will be written to this flag.

    The only problem I see with this approach is if some pages get flushed
during the transaction, because these pages will have to be reloaded from
disk.

    To address the problem of non-functional update, I intend to store a
command identifier with the tuple and, during an update, check whether the
cid of a tuple is equal to the current cid of this transaction (like we do
today).
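
The cid check described above can be illustrated with a small sketch. It is only a toy Python model, not PostgreSQL code: the row layout (`value`, `xid`, `cid`, `live`) and the helper names `is_own_output` and `update_all` are assumptions made for the example. The point it shows is that an update which appends new tuple versions must skip the versions it has just written itself, otherwise the scan would never finish.

```python
def is_own_output(row, current_xid, current_cid):
    """True if the row was written by the command that is currently running."""
    return row["xid"] == current_xid and row["cid"] == current_cid

def update_all(rows, current_xid, current_cid, fn):
    """Apply fn to every live row by writing a new version and retiring the old one."""
    i = 0
    while i < len(rows):  # the scan also reaches rows appended below
        row = rows[i]
        if row["live"] and not is_own_output(row, current_xid, current_cid):
            # Retire the old version and stamp it with the acting transaction.
            row["live"] = False
            row["xid"], row["cid"] = current_xid, current_cid
            # Append the new version, stamped with the same xid and cid.
            rows.append({"value": fn(row["value"]), "xid": current_xid,
                         "cid": current_cid, "live": True})
        i += 1
    return rows

rows = [{"value": 1, "xid": 0, "cid": 0, "live": True},
        {"value": 2, "xid": 0, "cid": 0, "live": True}]
update_all(rows, current_xid=5, current_cid=1, fn=lambda v: v + 1)
live_values = [r["value"] for r in rows if r["live"]]
```

Without the `is_own_output` test, the loop would keep updating the versions it had just appended.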

    To keep track of current transactions, there will be a list of tuples
affected by this transaction, and the operation executed. This way,
during commit, we only confirm these operations in relations (writing an
invalid OID in current xid of each tuple affected). To roll back, we
delete the new tuples (and mark this operation as a commit) and mark the
old tuples affected as "live" (and leave these committed).
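
The commit and rollback steps above can be sketched roughly as follows. This is only an illustrative Python model, not PostgreSQL code: the names `Tuple`, `Transaction`, `INVALID_XID` and the `live` field are assumptions made for the example, and a relation is modeled as a plain list.

```python
from dataclasses import dataclass, field

INVALID_XID = 0  # assumed marker: no transaction in progress on this tuple

@dataclass(eq=False)  # compare tuples by identity, not by field values
class Tuple:
    value: str
    xid: int = INVALID_XID  # in-progress transaction id, or INVALID_XID
    live: bool = True

@dataclass
class Transaction:
    xid: int
    affected: list = field(default_factory=list)  # (operation, tuple) pairs

    def insert(self, relation, value):
        t = Tuple(value, xid=self.xid)
        relation.append(t)
        self.affected.append(("insert", t))
        return t

    def delete(self, t):
        t.live = False
        t.xid = self.xid
        self.affected.append(("delete", t))

    def commit(self):
        # Confirm every operation by clearing the in-progress marker.
        for _, t in self.affected:
            t.xid = INVALID_XID
        self.affected.clear()

    def rollback(self, relation):
        # Undo in reverse order: drop new tuples, revive old ones.
        for op, t in reversed(self.affected):
            if op == "insert":
                relation.remove(t)
            else:
                t.live = True
            t.xid = INVALID_XID
        self.affected.clear()

relation = []
t1 = Transaction(xid=7)
a = t1.insert(relation, "a")
t1.commit()

t2 = Transaction(xid=8)
t2.delete(a)
t2.insert(relation, "b")
t2.rollback(relation)
```

After the rollback, `relation` again holds only the tuple `a`, live and with its marker reset.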

    I'm thinking of assigning a transaction id to each new backend, and the
postmaster will keep track of used transaction ids. This way, there is
no need to keep a list of transactions in shared memory.

    For recovery (my second goal), I intend, at postmaster startup,
to roll back all transactions marked as in progress. After that, I'm thinking
about a redo log, but I'm still searching for a way to keep it as
small as possible.
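
A startup pass of this kind might look like the sketch below. It is a toy Python model under a loud assumption: each stored tuple also records which operation (`op`) the in-progress transaction performed on it, since the in-memory list of affected tuples would not survive a crash. The proposal does not say how this would be stored, and the field names here are hypothetical.

```python
INVALID_XID = 0  # assumed marker: no transaction in progress on this tuple

def recover(relation):
    """Undo every tuple still carrying an in-progress xid; return the cleaned relation."""
    recovered = []
    for t in relation:
        if t["xid"] == INVALID_XID:
            recovered.append(t)  # already committed, keep as is
        elif t["op"] == "insert":
            continue  # new tuple of an unfinished transaction: drop it
        else:
            # Old tuple touched by an unfinished transaction: make it live again.
            recovered.append({**t, "xid": INVALID_XID, "op": None, "live": True})
    return recovered

relation = [
    {"value": "a", "xid": 0, "op": None, "live": True},
    {"value": "b", "xid": 9, "op": "insert", "live": True},
    {"value": "c", "xid": 9, "op": "delete", "live": False},
]
after_crash = recover(relation)
```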

  Suggestions? Comments?

        Robson.

Re: [HACKERS] Transaction system (proposal for 6.5)

From

Michael Meskes

Date:

17 September 1998, 13:05:39

On Wed, Sep 16, 1998 at 06:35:53PM -0300, Robson Miranda wrote:
>     I have been thinking about a major rewrite of the PostgreSQL transaction
> system, in order to reduce tuple overhead and provide recoverability.

I do not have much of an idea how postgres handles stuff right now, so
forgive me if I'm asking stupid questions.

>     My first goal is to reduce tuple overhead by getting rid of xmin/xmax and
> cmin/cmax. To provide this functionality, I'm planning to keep only a
> flag indicating whether the transaction is in progress or not. If, during a
> transaction, a certain tuple is affected, this flag will store the
> current transaction id. Then, when this tuple is committed, an invalid OID
> (say, 0) will be written to this flag.

That means you store one flag per tuple? Does this happen only in memory?

>     The only problem I see with this approach is if some pages get flushed
> during the transaction, because these pages will have to be reloaded from
> disk.

Ah yes, it seems to be in memory only. And you point precisely at one problem.
Any idea how to solve this?

>     To keep track of current transactions, there will be a list of tuples
> affected by this transaction, and the operation executed. This way,
> during commit, we only confirm these operations in relations (writing an
> invalid OID in current xid of each tuple affected). To roll back, we
> delete the new tuples (and mark this operation as a commit) and mark the
> old tuples affected as "live" (and leave these committed).

That means we always have both in the relation? That is, we write the new
tuple in and keep the old one? Is this done the same way in the current
version? I'd prefer to have a clean cut with new and old not being in the
same table at the same time.

>     For recovery (my second goal), I intend, at postmaster startup,
> to roll back all transactions marked as in progress. After that, I'm thinking
> about a redo log, but I'm still searching for a way to keep it as
> small as possible.

Where's the problem with a redo log?

Michael
--
Dr. Michael Meskes      | Th.-Heuss-Str. 61, D-41812 Erkelenz | Go SF49ers!
Senior-Consultant       | business: Michael.Meskes@mummert.de | Go Rhein Fire!
Mummert+Partner         | private: Michael.Meskes@usa.net     | Use Debian
Unternehmensberatung AG |          Michael.Meskes@gmx.net     | GNU/Linux!

Re: [HACKERS] Transaction system (proposal for 6.5)

From

Bruce Momjian

Date:

20 September 1998, 22:01:45

> Hi...
>
>
>     I have been thinking about a major rewrite of the PostgreSQL transaction
> system, in order to reduce tuple overhead and provide recoverability.
>
>     My first goal is to reduce tuple overhead by getting rid of xmin/xmax and
> cmin/cmax. To provide this functionality, I'm planning to keep only a
> flag indicating whether the transaction is in progress or not. If, during a
> transaction, a certain tuple is affected, this flag will store the
> current transaction id. Then, when this tuple is committed, an invalid OID
> (say, 0) will be written to this flag.
>
>     The only problem I see with this approach is if some pages get flushed
> during the transaction, because these pages will have to be reloaded from
> disk.
>
>     To address the problem of non-functional update, I intend to store a
> command identifier with the tuple and, during an update, check whether the
> cid of a tuple is equal to the current cid of this transaction (like we do
> today).
>
>     To keep track of current transactions, there will be a list of tuples
> affected by this transaction, and the operation executed. This way,
> during commit, we only confirm these operations in relations (writing an
> invalid OID in current xid of each tuple affected). To roll back, we
> delete the new tuples (and mark this operation as a commit) and mark the
> old tuples affected as "live" (and leave these committed).
>
>     I'm thinking of assigning a transaction id to each new backend, and the
> postmaster will keep track of used transaction ids. This way, there is
> no need to keep a list of transactions in shared memory.
>
>     For recovery (my second goal), I intend, at postmaster startup,
> to roll back all transactions marked as in progress. After that, I'm thinking
> about a redo log, but I'm still searching for a way to keep it as
> small as possible.


Interesting.  I know we have talked in the past about the various system
columns and their removal.  If you check the hackers archive under cmin,
etc, I think you will find some discussion.

Now, as far as their removal goes, is it worth removing 8 bytes of tuple
overhead in exchange for having to do a redo log, etc.?  I am not sure.
I know many commercial databases have it, but I am not sure how
beneficial it would be.

What I would really like is the ability to re-use superseded tuples
without vacuum.  It seems that should be possible, but it has not been
done by anyone yet.  That would be a HUGE win, I think.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
http://www.op.net/~candle              |  (610) 353-9879(w)
  +  If your life is a hard drive,     |  (610) 853-3000(h)
  +  Christ can be your backup.        |

Re: [HACKERS] Transaction system (proposal for 6.5)

From

The Hermit Hacker

Date:

21 September 1998, 02:03:51

On Sun, 20 Sep 1998, Bruce Momjian wrote:

> > Hi...
> >
> >
> >     I have been thinking about a major rewrite of the PostgreSQL transaction
> > system, in order to reduce tuple overhead and provide recoverability.
> >
> >     My first goal is to reduce tuple overhead by getting rid of xmin/xmax and
> > cmin/cmax. To provide this functionality, I'm planning to keep only a
> > flag indicating whether the transaction is in progress or not. If, during a
> > transaction, a certain tuple is affected, this flag will store the
> > current transaction id. Then, when this tuple is committed, an invalid OID
> > (say, 0) will be written to this flag.
> >
> >     The only problem I see with this approach is if some pages get flushed
> > during the transaction, because these pages will have to be reloaded from
> > disk.
> >
> >     To address the problem of non-functional update, I intend to store a
> > command identifier with the tuple and, during an update, check whether the
> > cid of a tuple is equal to the current cid of this transaction (like we do
> > today).
> >
> >     To keep track of current transactions, there will be a list of tuples
> > affected by this transaction, and the operation executed. This way,
> > during commit, we only confirm these operations in relations (writing an
> > invalid OID in current xid of each tuple affected). To roll back, we
> > delete the new tuples (and mark this operation as a commit) and mark the
> > old tuples affected as "live" (and leave these committed).
> >
> >     I'm thinking of assigning a transaction id to each new backend, and the
> > postmaster will keep track of used transaction ids. This way, there is
> > no need to keep a list of transactions in shared memory.
> >
> >     For recovery (my second goal), I intend, at postmaster startup,
> > to roll back all transactions marked as in progress. After that, I'm thinking
> > about a redo log, but I'm still searching for a way to keep it as
> > small as possible.
>
>
> Interesting.  I know we have talked in the past about the various system
> columns and their removal.  If you check the hackers archive under cmin,
> etc, I think you will find some discussion.
>
> Now, as far as their removal goes, is it worth removing 8 bytes of tuple
> overhead in exchange for having to do a redo log, etc.?  I am not sure.
> I know many commercial databases have it, but I am not sure how
> beneficial it would be.

    I may be missing something in the original posting that you are
seeing, but I don't see the two as necessarily being inter-related...my
understanding of the Oracle redo logs is that if a database is corrupted, you
can rebuild it from the last backup + the redo logs to get to the same
point as where the corruption happened...
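
The backup-plus-redo-log idea described above can be illustrated with a toy model. This is only a Python sketch under stated assumptions, not Oracle's or PostgreSQL's actual log format: the database is modeled as a dictionary, and each hypothetical log record is an `(operation, key, value)` triple.

```python
def replay(backup, redo_log):
    """Rebuild the database state from the last backup plus the logged changes."""
    db = dict(backup)  # start from a copy of the backup
    for op, key, value in redo_log:
        if op == "put":
            db[key] = value
        elif op == "delete":
            db.pop(key, None)
        else:
            raise ValueError(f"unknown log operation: {op}")
    return db

backup = {"alice": 10, "bob": 20}
redo_log = [
    ("put", "carol", 30),
    ("put", "alice", 15),
    ("delete", "bob", None),
]
restored = replay(backup, redo_log)
```

Replaying the log in order on top of the backup brings the data back to the state it had when the last record was written.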

> What I would really like is the ability to re-use superseded tuples
> without vacuum.  It seems that should be possible, but it has not been
> done by anyone yet.  That would be a HUGE win, I think.

    Not sure, but IMHO, having a redo log capability would be a HUGE
win also...consider a mission critical application that doesn't have, in
essence, "live backups" in the form of a redo log...

Marc G. Fournier
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org

Re: [HACKERS] Transaction system (proposal for 6.5)

From

Michael Graff

Date:

21 September 1998, 11:55:40

The Hermit Hacker <scrappy@hub.org> writes:

>     Not sure, but IMHO, having a redo log capability would be a HUGE
> win also...consider a mission critical application that doesn't have, in
> essence, "live backups" in the form of a redo log...

Consider that every PostgreSQL application I write has two backups:

    o a full database dump,
    o an incremental change log,

so I can do exactly that...

--Michael

Re: [HACKERS] Transaction system (proposal for 6.5)

From

Vadim Mikheev

Date:

24 September 1998, 22:43:53

Robson Miranda wrote:
>
>         I have been thinking about a major rewrite of the PostgreSQL transaction
> system, in order to reduce tuple overhead and provide recoverability.
>
>         My first goal is to reduce tuple overhead by getting rid of xmin/xmax and
> cmin/cmax. To provide this functionality, I'm planning to keep only a

I need xmin & xmax for multi-version concurrency control...
Let's decide what should be implemented in 6.5...

>         To address the problem of non-functional update, I intend to store a
> command identifier with the tuple and, during an update, check whether the
> cid of a tuple is equal to the current cid of this transaction (like we do
> today).

cmin & cmax greatly simplify the implementation of data-change
visibility rules - I'm not sure it is even possible to
do this with only one attribute for the command id,
keeping in mind triggers, (PL/)SQL functions...
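
To illustrate the point about two command ids, here is a deliberately simplified Python sketch of a visibility check for tuples written by the current transaction only. It ignores xmin/xmax and other transactions entirely, and the function name `visible_in_own_xact` is an assumption made for the example, not the actual PostgreSQL routine.

```python
def visible_in_own_xact(cmin, cmax, curcid):
    """Decide whether a tuple of the current transaction is visible to command curcid.

    cmin: id of the command that inserted the tuple
    cmax: id of the command that deleted it, or None if it is not deleted
    """
    if cmin >= curcid:
        return False  # inserted by this command or a later one
    if cmax is None:
        return True  # inserted earlier and never deleted
    return cmax >= curcid  # deleted by this or a later command: still visible

# A tuple inserted by command 1 and deleted by command 3 of the same transaction.
seen_by_cmd_2 = visible_in_own_xact(cmin=1, cmax=3, curcid=2)
seen_by_cmd_4 = visible_in_own_xact(cmin=1, cmax=3, curcid=4)
```

A scan that started as command 2 (for instance inside a trigger or function that is still running) has to keep seeing the tuple even after command 3 deletes it, which needs both the inserting and the deleting command id.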

Vadim