Thread: RE: postgres 7.2 features.

RE: postgres 7.2 features.

From: "Mikheev, Vadim"
> Is WAL planned for 7.1? What is the story with WAL?

Yes.

> I'm a bit concerned that the current storage manager is going to be
> thrown in the bit bucket without any thought for its benefits. There's
> some stuff I want to do with it like resurrecting time travel,

Why not use triggers for time travel?
The disadvantages of transaction-commit-time-based time travel were pointed
out a few days ago.
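For illustration, the trigger approach boils down to maintaining a history
table yourself. A minimal sketch (table, column, and function names are
invented, and the syntax shown is that of much later PostgreSQL releases, not
what 7.x understood):

CREATE TABLE emp (
    empno   integer PRIMARY KEY,
    salary  numeric
);

-- One row per superseded version, stamped with the time of the change.
CREATE TABLE emp_hist (
    empno    integer,
    salary   numeric,
    changed  timestamp,
    op       text
);

CREATE FUNCTION emp_save_history() RETURNS trigger AS $$
BEGIN
    INSERT INTO emp_hist VALUES (OLD.empno, OLD.salary, now(), TG_OP);
    RETURN NULL;  -- return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER emp_history
    AFTER UPDATE OR DELETE ON emp
    FOR EACH ROW EXECUTE PROCEDURE emp_save_history();

Historical queries then read emp_hist, and discarding old history is just a
DELETE against it.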

> some database replication stuff which can make use of the non-destructive

It was mentioned here that triggers could be used for async replication,
as well as WAL.
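Roughly, trigger-based replication means a change queue maintained by
triggers, which some external process then ships to the replica. Again only a
sketch, with invented names and later-release syntax, reusing the toy emp
table from the sketch above:

CREATE TABLE repl_queue (
    id        serial PRIMARY KEY,
    tablename text,
    op        text,
    pk        text,
    stamped   timestamp DEFAULT now()
);

CREATE FUNCTION emp_queue_change() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO repl_queue (tablename, op, pk)
        VALUES (TG_TABLE_NAME, TG_OP, OLD.empno::text);
    ELSE
        INSERT INTO repl_queue (tablename, op, pk)
        VALUES (TG_TABLE_NAME, TG_OP, NEW.empno::text);
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER emp_replicate
    AFTER INSERT OR UPDATE OR DELETE ON emp
    FOR EACH ROW EXECUTE PROCEDURE emp_queue_change();

-- A replication daemon would periodically read repl_queue, apply the changes
-- on the replica, and delete the entries it has already shipped.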

> storage method etc. There's a whole lot of interesting stuff that can be
> done with the current storage manager.

Vadim


Re: postgres 7.2 features.

From: Chris Bitmead
"Mikheev, Vadim" wrote:
> Yes.
> 
> > I'm a bit concerned that the current storage manager is going to be
> > thrown in the bit bucket without any thought for its benefits. There's
> > some stuff I want to do with it like resurrecting time travel,
> 
> Why not use triggers for time travel?
> The disadvantages of transaction-commit-time-based time travel were pointed
> out a few days ago.

Triggers for time travel are MUCH less efficient. There is no copying
involved either in memory or on disk with the original postgres time travel,
nor is there any logic to be executed. Then you've got to figure out
strategies for efficiently deleting old data if you want to. The old
postgres was the Right Thing, if you want access to time travel.

> It was mentioned here that triggers could be used for async replication,
> as well as WAL.

Same story. Major inefficiency. Replication is tough enough without mucking
around with triggers. Once the trigger executes you've got to go and store
the data in the database again anyway. Then figure out when to delete it.

> > storage method etc. There's a whole lot of interesting stuff that can be
> > done with the current storage manager.
> 
> Vadim


RE: postgres 7.2 features.

From: "Mikheev, Vadim"
> > > some stuff I want to do with it like resurrecting time travel,
> > 
> > Why not use triggers for time travel?
> > The disadvantages of transaction-commit-time-based time travel
> > were pointed out a few days ago.
> 
> Triggers for time travel are MUCH less efficient. There is no copying
> involved either in memory or on disk with the original postgres time
> travel, nor is there any logic to be executed.

With the original TT:

- you are not able to use indices to fetch tuples by time;
- you are not able to control tuples' lifetime;
- you have to store commit time somewhere;
- you have to store additional 8 bytes for each tuple;
- 1 sec could be too long a time interval for some uses of TT.

And, btw, what could be *really* useful is TT + a referential integrity
feature. How could that be implemented without triggers?

Imho, triggers can give you a much more flexible and useful TT...

Also note that TT was removed from Illustra, and the authors wrote that
built-in TT could be implemented without a non-overwriting smgr.

> > It was mentioned here that triggers could be used for async 
> > replication, as well as WAL.
> 
> Same story. Major inefficiency. Replication is tough enough without
> mucking around with triggers. Once the trigger executes you've got
> to go and store the data in the database again anyway. Then figure
> out when to delete it.

What about reading WAL to get and propagate changes? I don't think that
reading tables will be more efficient and, btw, 
how to know what to read (C) -:) ?
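(For concreteness: reading changes back out of the WAL is essentially what
logical decoding does in much later PostgreSQL releases; nothing like it
exists yet, and the slot name below is invented. test_decoding is the demo
decoder shipped with those releases.)

-- Requires wal_level = logical.
SELECT pg_create_logical_replication_slot('repl_demo', 'test_decoding');

-- Each call returns the row changes accumulated in the WAL since the last
-- read, in commit order, ready to be propagated elsewhere.
SELECT * FROM pg_logical_slot_get_changes('repl_demo', NULL, NULL);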

Vadim


Re: postgres 7.2 features.

From: Chris Bitmead
The bottom line is that the original postgres time-travel implementation
was totally cost-free. Actually it may have even speeded things
up since vacuum would have less work to do. Can you convince me that
triggers can compare anywhere near for performance? I can't see how.
All I'm asking is don't damage anything that is in postgres now that
is relevant to time-travel in your quest for WAL....

> With the original TT:
> 
> > - you are not able to use indices to fetch tuples by time;

Doesn't sound very hard to fix...

> > - you are not able to control tuples' lifetime;

From the docs... "Applications that do not want to save historical data
can specify a cutoff point for a relation. Cutoff points are defined by
the discard command." The command "discard EMP before '1 week'"
deletes data in the EMP relation that is more than 1 week old.

> - you have to store commit time somewhere;

Ok, so?

> - you have to store additional 8 bytes for each tuple;

A small price for time travel.

> > - 1 sec could be too long a time interval for some uses of TT.

So someone in the future can implement finer grains. If time travel
disappears this option is not open.

> And, btw, what could be *really* useful is TT + a referential integrity
> feature. How could that be implemented without triggers?

In what way does TT not have referential integrity? As long as the system
assures that every transaction writes the same timestamp to all tuples then
referential integrity continues to exist.

> Imho, triggers can give you a much more flexible and useful TT...
> 
> Also note that TT was removed from Illustra, and the authors wrote that
> built-in TT could be implemented without a non-overwriting smgr.

Of course it can be, but can it be done anywhere near as efficiently?

> > > It was mentioned here that triggers could be used for async
> > > replication, as well as WAL.
> >
> > Same story. Major inefficiency. Replication is tough enough without
> > mucking around with triggers. Once the trigger executes you've got
> > to go and store the data in the database again anyway. Then figure
> > out when to delete it.
> 
> What about reading WAL to get and propagate changes? I don't think that
> reading tables will be more efficient and, btw,
> how to know what to read (C) -:) ?

Maybe that is a good approach, but it's not clear that it is the best.
More research is needed. With the no-overwrite storage manager there
exists a mechanism for deciding how long a tuple exists and this
can easily be tapped into for replication purposes. Vacuum could
serve two purposes: vacuuming and replication.


Re: Storage Manager (was postgres 7.2 features.)

From: Chris Bitmead
Also, does WAL offer instantaneous crash recovery like no-overwrite?


"Mikheev, Vadim" wrote:
> 
> > > > some stuff I want to do with it like resurrecting time travel,
> > >
> > > Why not use triggers for time travel?
> > > The disadvantages of transaction-commit-time-based time travel
> > > were pointed out a few days ago.
> >
> > Triggers for time travel are MUCH less efficient. There is no copying
> > involved either in memory or on disk with the original postgres time
> > travel, nor is there any logic to be executed.
> 
> With the original TT:
> 
> - you are not able to use indices to fetch tuples by time;
> - you are not able to control tuples' lifetime;
> - you have to store commit time somewhere;
> - you have to store additional 8 bytes for each tuple;
> - 1 sec could be too long a time interval for some uses of TT.
> 
> And, btw, what could be *really* useful is TT + a referential integrity
> feature. How could that be implemented without triggers?
> 
> Imho, triggers can give you a much more flexible and useful TT...
> 
> Also note that TT was removed from Illustra, and the authors wrote that
> built-in TT could be implemented without a non-overwriting smgr.
> 
> > > It was mentioned here that triggers could be used for async
> > > replication, as well as WAL.
> >
> > Same story. Major inefficiency. Replication is tough enough without
> > mucking around with triggers. Once the trigger executes you've got
> > to go and store the data in the database again anyway. Then figure
> > out when to delete it.
> 
> What about reading WAL to get and propagate changes? I don't think that
> reading tables will be more efficient and, btw,
> how to know what to read (C) -:) ?
> 
> Vadim


Re: postgres 7.2 features.

From: Bruce Momjian
> 
> The bottom line is that the original postgres time-travel implementation
> was totally cost-free. Actually it may have even speeded things
> up since vacuum would have less work to do. Can you convince me that
> triggers can compare anywhere near for performance? I can't see how.
> All I'm asking is don't damage anything that is in postgres now that
> is relevant to time-travel in your quest for WAL....

Basically, time travel was getting in the way of more requested features
that had to be added.  Keeping it around has a cost, and no one felt the
cost was worth the benefit. You may disagree, but at the time, that was
the consensus, and I assume it still is.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: postgres 7.2 features.

From: Chris Bitmead
Bruce Momjian wrote:
> 
> >
> > The bottom line is that the original postgres time-travel implementation
> > was totally cost-free. Actually it may have even speeded things
> > up since vacuum would have less work to do. Can you convince me that
> > triggers can compare anywhere near for performance? I can't see how.
> > All I'm asking is don't damage anything that is in postgres now that
> > is relevant to time-travel in your quest for WAL....
> 
> Basically, time travel was getting in the way of more requested features

Do you mean way back when it was removed? How was it getting in the way?

> that had to be added.  Keeping it around has a cost, and no one felt the
> cost was worth the benefit. You may disagree, but at the time, that was
> the consensus, and I assume it still is.


Re: postgres 7.2 features.

From: Philip Warner
At 11:27 11/07/00 +1000, Chris Bitmead wrote:
>
>> It was mentioned here that triggers could be used for async replication,
>> as well as WAL.
>
>Same story. Major inefficiency. Replication is tough enough without mucking
>around with triggers. Once the trigger executes you've got to go and store
>the data in the database again anyway. Then figure out when to delete it.
>

The WAL *should* be the most efficient technique for replication (this said
without actually having seen it ;-}). 



----------------------------------------------------------------
Philip Warner
Albatross Consulting Pty. Ltd. (A.C.N. 008 659 498)
Tel: (+61) 0500 83 82 81
Fax: (+61) 0500 83 82 82
Http://www.rhyme.com.au
PGP key available upon request, and from pgp5.ai.mit.edu:11371


Re: Storage Manager (was postgres 7.2 features.)

From: Chris Bitmead
Has sufficient research been done to warrant destruction of what is
currently there?

According to the postgres research papers, the no-overwrite storage
manager has the following attributes...

* It's always faster than WAL in the presence of stable main memory.
(Whether the stable caches in modern disk drives are an approximation I
don't know).

* It's more scalable and has less logging contention. This allows
greater scalability in the presence of multiple processors.

* Instantaneous crash recovery.

* Time travel is available at no cost.

* Easier to code and prove correctness. (I used to work for a database
company that implemented WAL, and it took them a large number of years
before they supposedly corrected every bug and crash condition on
recovery).

* Ability to keep archival records on an archival medium.

Is there any research on the level of what was done previously to
warrant abandoning these benefits? Obviously WAL has its own benefits, I
just don't want to see the current benefits lost.


Re: Storage Manager (was postgres 7.2 features.)

From: JanWieck@t-online.de (Jan Wieck)
Chris Bitmead wrote:
>
> Has sufficient research been done to warrant destruction of what is
> currently there?
    What's currently there doesn't have TT any more. So there is
    nothing we would destroy with an overwriting SMGR.

> According to the postgres research papers, the no-overwrite storage
> manager has the following attributes...
    I started using (and hacking) Postgres in version 4.2, which was the
    last official release from Stonebraker's team at UCB (and the last one
    with the PostQUEL query language).

    The no-overwriting SMGR concept was one of the things the entire project
    set out to prove. The idea was to combine rollback and logging
    information with the data itself, by only storing new values and
    remembering when something appeared or disappeared. Stable memory just
    means "if I know my write made it to some point, I can read it back
    later even in the case of a crash".

    This has never been implemented to a degree that is capable of catching
    hardware failures like unexpected loss of power. So the project finally
    said "it might be possible". Many other questions have been answered by
    the project, but exactly this one is still open.
 

> * It's always faster than WAL in the presence of stable main memory.
> (Whether the stable caches in modern disk drives are an approximation I
> don't know).
    For writing, yes. But for highly updated tables, the scans will
    soon slow down due to the junk contention.

> * It's more scalable and has less logging contention. This allows
> greater scalability in the presence of multiple processors.
>
> * Instantaneous crash recovery.
    Because this never worked reliably, Vadim is working on WAL.

> * Time travel is available at no cost.
>
> * Easier to code and prove correctness. (I used to work for a database
> company that implemented WAL, and it took them a large number of years
> before they supposedly corrected every bug and crash condition on
> recovery).
>
> * Ability to keep archival records on an archival medium.
   Has this ever been implemented?

> Is there any research on the level of what was done previously to
> warrant abandoning these benefits? Obviously WAL has its own benefits, I
> just don't want to see the current benefits lost.
    I see your points. Maybe we can leave the no-overwriting SMGR in the
    code, and just make the new one the default.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #




Re: postgres 7.2 features.

From: Bruce Momjian
> Bruce Momjian wrote:
> > 
> > >
> > > The bottom line is that the original postgres time-travel implementation
> > > was totally cost-free. Actually it may have even speeded things
> > > up since vacuum would have less work to do. Can you convince me that
> > > triggers can compare anywhere near for performance? I can't see how.
> > > All I'm asking is don't damage anything that is in postgres now that
> > > is relevant to time-travel in your quest for WAL....
> > 
> > Basically, time travel was getting in the way of more requested features
> 
> Do you mean way back when it was removed? How was it getting in the way?

Yes.  Every tuple had this time-thing that had to be tested.  Vadim
wanted to remove it to clean up the code, and we all agreed.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: Storage Manager (was postgres 7.2 features.)

From: Chris Bitmead
Jan Wieck wrote:

>     What's  currently there doesn't have TT any more. So there is
>     nothing we would destroy with an overwriting SMGR.

I know, but I wanted to resurrect it at some stage, and I think a lot of
important bits are still there.

> > * It's always faster than WAL in the presence of stable main memory.
> > (Whether the stable caches in modern disk drives are an approximation I
> > don't know).
> 
>     For writing, yes. But for highly updated tables, the scans will
>     soon slow down due to the junk contention.

I imagine highly updated applications won't be interested in time
travel. If they are then the alternative of a user-maintained time-stamp
and triggers will still leave you with "junk".

> > * Instantaneous crash recovery.
> 
>     Because this never worked reliably, Vadim is working on WAL.

Postgres recovery is not reliable?


Re: postgres 7.2 features.

From: Chris Bitmead
Bruce Momjian wrote:

> > Do you mean way back when it was removed? How was it getting in the way?
> 
> Yes.  Every tuple had this time-thing that had to be tested.  Vadim
> wanted to remove it to clean up the code, and we all agreed.

And did that save a lot of code?


Re: postgres 7.2 features.

From: Bruce Momjian
> Bruce Momjian wrote:
> 
> > > Do you mean way back when it was removed? How was it getting in the way?
> > 
> > Yes.  Every tuple had this time-thing that had to be tested.  Vadim
> > wanted to remove it to clean up the code, and we all agreed.
> 
> And did that save a lot of code?
> 

It simplified the code.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


RE: postgres 7.2 features.

From: "Mikheev, Vadim"
> The bottom line is that the original postgres time-travel 
> implementation was totally cost-free. 

I disagree. I can't consider an additional > 8 bytes per tuple +
pg_time (4 bytes per transaction... please remember that ppl
complain even about pg_log - 2 bits per transaction) as
totally cost-free for a half-useful built-in feature used
by 10% of users.
Note that I'm not talking about the overwriting/non-overwriting smgr at all!
That's not the issue. There are no problems with keeping dead tuples in files
as long as required. When I talked about a new smgr I meant the ability to
re-use space without vacuum and to store more than one table per file.
But I'll object to storing transaction commit times in the tuple header and
in the old-designed pg_time. If you want to do TT - welcome... but make
it optional, without affecting those who have no need of TT.

> Actually it may have even speeded things up since vacuum would have
> less work to do.

This would only make *TT users* happy -:)

> Can you convince me that triggers can compare anywhere near for
> performance?

No, they can't. But this is bad only for *TT users* -:)

> I can't see how. All I'm asking is don't damage anything that is in postgres
> now that is relevant to time-travel in your quest for WAL....

It's not related to WAL!
Though... With WAL pg_log is not required to be permanent: we could re-use
transaction IDs after db restart... Well, seems we can handle this.

> > With the original TT:
> > 
> > - you are not able to use indices to fetch tuples by time;
> 
> Doesn't sound very hard to fix...

Really? The commit time is unknown until commit - so you would have to insert
index tuples just before commit... how would you know what to insert?

> > - you are not able to control tuples' lifetime;
> 
> From the docs... "Applications that do not want to save 
> historical data can specify a cutoff point for a relation.
> Cutoff points are defined by the discard command"

I meant another thing: when I have to deal with history,
I sometimes need to change historical dates (c) -:))
Probably we can handle this as well, just some additional
complexity -:)

> > - you have to store commit time somewhere;
> 
> Ok, so?

Space.

> > - you have to store additional 8 bytes for each tuple;
> 
> A small price for time travel.

Not for those who aren't going to use TT at all.
The lower performance of a trigger implementation is a smaller price for me.

> > - 1 sec could be too long a time interval for some uses of TT.
> 
> So someone in the future can implement finer grains. If time travel
> disappears this option is not open.

Still open, with triggers -:)
As well as Colour-Travel and all other travels -:)

> > And, btw, what could be *really* useful is TT + a referential
> > integrity feature. How could that be implemented without
> > triggers?
> 
> In what way does TT not have referential integrity? As long as the
> system assures that every transaction writes the same timestamp to all
> tuples then referential integrity continues to exist.

The same tuple of a table with a PK may be updated many times by many
transactions in 1 second. With a 1-sec grain you would read *many* historical
tuples with the same PK, all valid at the same time. So we need "finer
grains" right now...

> > Imho, triggers can give you a much more flexible and useful TT...
> > 
> > Also note that TT was removed from Illustra, and the authors wrote that
> > built-in TT could be implemented without a non-overwriting smgr.
> 
> Of course it can be, but can it be done anywhere near as efficiently?

But without losing efficiency where TT is not required.

> > > > It was mentioned here that triggers could be used for async
> > > > replication, as well as WAL.
> > >
> > > Same story. Major inefficiency. Replication is tough enough without
> > > mucking around with triggers. Once the trigger executes you've got
> > > to go and store the data in the database again anyway. Then figure
> > > out when to delete it.
> > 
> > What about reading WAL to get and propagate changes? I 
> > don't think that reading tables will be more efficient and, btw,
> > how to know what to read (C) -:) ?
> 
> Maybe that is a good approach, but it's not clear that it is the best.
> More research is needed. With the no-overwrite storage manager there
> exists a mechanism for deciding how long a tuple exists and this
> can easily be tapped into for replication purposes. Vacuum could 

This "mechanism" (just additional field in pg_class) can be used
for WAL based replication as well.

> serve two purposes: vacuuming and replication.

Vacuum is already slow; it's better to make it faster, not ever slower...
I see vacuum as an *optional* command someday... when we'll be able to
re-use space.

Vadim


RE: postgres 7.2 features.

From: "Mikheev, Vadim"
> > > Do you mean way back when it was removed? How was it getting in the way?
> > 
> > Yes.  Every tuple had this time-thing that had to be tested.  Vadim
> > wanted to remove it to clean up the code, and we all agreed.
> 
> And did that save a lot of code?

This removed one fsync per commit and saved a lot of space.

Vadim


Re: postgres 7.2 features.

From: Chris Bitmead
"Mikheev, Vadim" wrote:

> > > - 1 sec could be too long a time interval for some uses of TT.
> >
> > So someone in the future can implement finer grains. If time travel
> > disappears this option is not open.
> 
> Still open, with triggers -:)
> As well as Colour-Travel and all other travels -:)

Maybe you're right and time-travel should be relegated to the dustbin of
history. But it always seemed a really neat design ever since I read
about it 8 years ago or something.

It does seem to me that time is a much more fundamental idea to model
explicitly than Colour or any other thing you might dream up. The
concept that a data-store has a history is a very fundamental concept.

This can get very philosophical. Think about the difference between a
pure-functional programming language and a regular programming language.
One way of looking at it is that a pure-functional language models time
explicitly whereas a regular language models time implicitly. In a
pure-functional language a change of state is brought about by creating
a whole new state, never by destroying the previous state. The previous
state continues to exist as long as you have a need for it. Since I'm a
fan of pure functional languages this idea appeals to me.

On a practical note, the postgres time travel was very easy to use. It's
hard for me to see how a trigger mechanism could be as easy. For example
by default SELECT would always get the current values - sensible. If you
want historical values you have to add extra conditions, in a simple-to-use
syntax. The database took care of destroying historical data
according to your parameters. Can a trigger mechanism really make things
this easy?
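For example, the old syntax, as best I recall it from the Postgres
documentation (so the details are approximate), let you write:

-- Current values, as in any SQL database:
SELECT name, salary FROM emp;

-- The relation as it stood at a point in the past:
SELECT name, salary FROM emp['January 1 1993'];

-- Every version valid at any time in an interval:
SELECT name, salary FROM emp['epoch', 'now'];

-- Trim history older than a week (the discard command quoted earlier):
discard emp before '1 week'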