Thread: Please advise features in 7.1

Please advise features in 7.1

From
"John Huttley"
Date:
Hello,
I've looked at the resources available through the web page, CVS, and other
material; however, I can't find a statement of what's likely to be in 7.1 and
what is planned for later.

Reason: I want to know if any of these features are scheduled.

1. Calculated fields in table definitions, e.g.

   Create table test (
            A Integer,
            B integer,
           the_sum   As  (A+B),
   );

This is like MSSQL

2. Any parameterised triggers

3. Any parameterised stored procedures that return a result set.


These are _extraordinarily_ useful for application development.

If anyone has a way of bolting on any of these to 7.0, I'd be keen to hear from
you.

Regards

John




Re: Please advise features in 7.1

From
Tom Lane
Date:
"John Huttley" <John@mwk.co.nz> writes:
> Reason: I want to know if any of these features are scheduled.

> 1. Calculated fields in table definitions . eg.

>    Create table test (
>             A Integer,
>             B integer,
>            the_sum   As  (A+B),
> );

You can do that now (and for many versions past) with a trigger.
It's not quite as convenient as it ought to be, but it's possible.
AFAIK there's no change in that situation for 7.1.
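
For concreteness, a minimal sketch of the trigger approach (a plain integer
column the_sum kept up to date by a PL/pgSQL BEFORE trigger; names are
illustrative, and it assumes plpgsql has been installed with createlang):

    Create table test (
             A integer,
             B integer,
             the_sum integer
    );

    CREATE FUNCTION test_sum() RETURNS opaque AS '
        BEGIN
            -- recompute the dependent column on every insert/update
            NEW.the_sum := NEW.A + NEW.B;
            RETURN NEW;
        END;
    ' LANGUAGE 'plpgsql';

    CREATE TRIGGER test_sum_trig BEFORE INSERT OR UPDATE ON test
        FOR EACH ROW EXECUTE PROCEDURE test_sum();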

> 2. Any parameterised triggers

We've had parameterized triggers for years.  Maybe you attach some
meaning to that term beyond what I do?

> 3. Any parameterised stored procedures that return a result set.

There is some support (dating back to Berkeley Postquel) for functions
returning sets, but it's pretty ugly and limited.  Proper support might
happen in 7.2 ...
        regards, tom lane


Re: Please advise features in 7.1

From
"john huttley"
Date:
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "John Huttley" <John@mwk.co.nz>
Cc: <pgsql-hackers@postgresql.org>
Sent: Thursday, 23 November 2000 19:05
Subject: Re: [HACKERS] Please advise features in 7.1


> "John Huttley" <John@mwk.co.nz> writes:
> > Reason: I want to know if any of these features are scheduled.
>
> > 1. Calculated fields in table definitions . eg.
>
> >    Create table test (
> >             A Integer,
> >             B integer,
> >            the_sum   As  (A+B),
> > );
>
> You can do that now (and for many versions past) with a trigger.
> It's not quite as convenient as it ought to be, but it's possible.
> AFAIK there's no change in that situation for 7.1.
>


Yes,  Perhaps defining the table with a dummy field and setting up a
'before'
trigger which replaced that field with a calculated value?

Messy but feasible.


> > 2. Any parameterised triggers
>
> We've had parameterized triggers for years.  Maybe you attach some
> meaning to that term beyond what I do?

I'm referring to the manual that says functions used for triggers must have
no parameters
and return a type Opaque. And indeed it is impossible to create a trigger
from a plSQL function that takes any parameters.

Thus if we have a lot of triggers which are very similar, we cannot just use
one function
and pass an identifying  parameter or two to it. We must create an
individual function for each trigger.

Its irritating more than fatal.

> > 3. Any parameterised stored procedures that return a result set.
>
> There is some support (dating back to Berkeley Postquel) for functions
> returning sets, but it's pretty ugly and limited.  Proper support might
> happen in 7.2 ...

Something to look forward to! Meanwhile I'll have a play and see if its
possible to use a read trigger
to populate a temporary table. hmm, that might require a statement level
trigger. Another thing for 7.2,
i guess.

The application programming we are doing now utilises stored procedures
returning record sets
(MSSQL) and the lack is showstopper in our migration plans. Sigh.


Thanks Tom

Regards


John




Re: Please advise features in 7.1

From
Philip Warner
Date:
At 18:00 23/11/00 +1300, John Huttley wrote:
>
>1. Calculated fields in table definitions . eg.
>

Can't really do this - you might want to consider a view with an insert &
update rule. I'm not sure how flexible rules are and you may not be able to
write rules to make views functions like tables, but that is at least part
of their purpose I think.
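
A rough sketch of the view-plus-rules idea (names are illustrative, and the
rule shown covers only the simple insert case):

    CREATE TABLE test (a integer, b integer);

    CREATE VIEW test_v AS
        SELECT a, b, a + b AS the_sum FROM test;

    -- Writes go through rules on the view; e.g. for inserts:
    CREATE RULE test_v_ins AS ON INSERT TO test_v
        DO INSTEAD INSERT INTO test VALUES (new.a, new.b);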


----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: Please advise features in 7.1

From
Don Baccus
Date:
At 06:00 PM 11/23/00 +1300, John Huttley wrote:

>1. Calculated fields in table definitions . eg.
>
>   Create table test (
>            A Integer,
>            B integer,
>           the_sum   As  (A+B),
>);

...

>These are _extraordinarily_ useful for application development.
>
>If anyone has a way of bolting on any of these to 7.0, I'd be keen to hear from
>you.

Create a trigger on insert/update for this case...



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
 


Re: Please advise features in 7.1

From
Tom Lane
Date:
"john huttley" <john@mwk.co.nz> writes:
>> We've had parameterized triggers for years.  Maybe you attach some
>> meaning to that term beyond what I do?

> I'm referring to the manual that says functions used for triggers must
> have no parameters and return a type Opaque.

The function has to be declared that way, but you can actually pass a
set of string parameters to it from the CREATE TRIGGER command.  The
strings show up in some special variable or other inside the function.
(No, I don't know why it was done in that ugly way...)  See the manual's
discussion of trigger programming.
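
In PL/pgSQL that special variable is the TG_ARGV array. A hypothetical sketch
(table names and the label column are made up) of one generic function reused
by several triggers with different arguments:

    CREATE FUNCTION stamp_label() RETURNS opaque AS '
        BEGIN
            -- TG_ARGV[0] is the first string given in CREATE TRIGGER
            NEW.label := TG_ARGV[0];
            RETURN NEW;
        END;
    ' LANGUAGE 'plpgsql';

    CREATE TRIGGER stamp_a BEFORE INSERT ON table_a
        FOR EACH ROW EXECUTE PROCEDURE stamp_label('from table_a');
    CREATE TRIGGER stamp_b BEFORE INSERT ON table_b
        FOR EACH ROW EXECUTE PROCEDURE stamp_label('from table_b');
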
        regards, tom lane


Please advise features in 7.1 (SUMMARY)

From
"John Huttley"
Date:
Thanks for your help, everyone.

This is a summary of replies.

1. Calculated fields in table definitions, e.g.

   Create table test (
            A Integer,
            B integer,
           the_sum   As  (A+B),
   );

This functionality can be achieved through the use of views.
Implementing the create table syntax  may not be too hard,
but not in 7.1...

2. Parameterised Triggers

Functionality is there; it's just that the documentation gave the wrong impression.
A user manual example of using parameterised triggers to implement referential
integrity would be welcome.

3. Stored Procedures returning a record set.

Dream on!


Regards

John




Re: Please advise features in 7.1 (SUMMARY)

From
Andrew Snow
Date:
On Tue, 28 Nov 2000, John Huttley wrote:

> 3. Stored Procedures returning a record set.
> 
> Dream on!

This is something I would be really interested to see working. What are the
issues?  My understanding is that it is technically feasible but too
complicated to add to PL/pgSQL?  It seems to me a basic service that needs
to be implemented soon, even if it's just returning multiple rows of one
column...


- Andrew




beta testing version

From
"xuyifeng"
Date:
Hi,
  How long has PG 7.1 already been in beta testing? Can it be released before Christmas day? Can PG 7.1 recover a database
from a system crash?
 
  Thanks,

XuYifeng



Re: beta testing version

From
Don Baccus
Date:
At 04:17 PM 11/28/00 +0800, xuyifeng wrote:
>Hi,
>
>   how long is PG7.1 already in beta testing? can it be released before Christmas day?
>   can PG7.1  will recover database from system crash?

This guy's a troll from the PHP Builder's site (at least, Tim Perdue and I suspect this
due to some posts he made in regard to Tim's SourceForge/Postgres article).

Since he's read Tim's article, and at least some of the follow-up posts (given that
he's posted responses himself), he should know by now that PG 7.1 is still in a pre-beta
state and won't be released before Christmas day.  I also posted a fairly long answer
to a question Tim posted at phpbuilder.com regarding recoverability, and this guy's
undoubtedly read it, too.

Have I forgotten anything, xuyifeng?



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
 


Re: beta testing version

From
"xuyifeng"
Date:
No doubt I have touched on some problems PG has, right? If PG is so good,
is there any need for the team to improve PG further?

Regards,
XuYifeng

----- Original Message ----- 
From: Don Baccus <dhogaza@pacifier.com>
To: xuyifeng <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org>
Sent: Tuesday, November 28, 2000 10:37 PM
Subject: Re: [HACKERS] beta testing version


> At 04:17 PM 11/28/00 +0800, xuyifeng wrote:
> >Hi,
> >
> >   how long is PG7.1 already in beta testing? can it be released before Christmas day?
> >   can PG7.1  will recover database from system crash?
> 
> This guy's a troll from the PHP Builder's site (at least, Tim Perdue and I suspect this
> due to some posts he made in regard to Tim's SourceForge/Postgres article).
> 
> Since he's read Tim's article, and at least some of the follow-up posts (given that
> he's posted responses himself), he should know by now that PG 7.1 is still in a pre-beta
> state and won't be released before Christmas day.  I also posted a fairly long answer
> to a question Tim's posted at phpbuilder.com regarding recoverability and this guy's
> undoubtably read it, too.
> 
> Have I forgotten anything, xuyifeng?
> 
> 
> 
> - Don Baccus, Portland OR <dhogaza@pacifier.com>
>   Nature photos, on-line guides, Pacific Northwest
>   Rare Bird Alert Service and other goodies at
>   http://donb.photo.net.
> 


Re: beta testing version

From
Don Baccus
Date:
At 11:15 PM 11/28/00 +0800, xuyifeng wrote:
>no doubt,  I have touched some problems PG has, right?  if PG is so good,  
>is there any necessary for the team  to improve PG again?

See?  Troll...

The guy worships MySQL, just in case folks haven't made the connection.

I'm going to ignore him from now on, suggest others do the same, I'm sure
he'll go away eventually.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
 


Re: beta testing version

From
"xuyifeng"
Date:
You are completely wrong. If I didn't like PG, I would never come here or say anything about PG; I wouldn't care about it.
I just want PG to be improved quickly; for me crash recovery is a very urgent problem,
otherwise PG is forced to stay on my desktop machine, and we won't dare move it to our server.
I always see myself as a customer, and the customer is always right.

Regards,
XuYifeng


----- Original Message ----- 
From: Don Baccus <dhogaza@pacifier.com>
To: xuyifeng <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org>
Sent: Tuesday, November 28, 2000 11:16 PM
Subject: Re: [HACKERS] beta testing version


> At 11:15 PM 11/28/00 +0800, xuyifeng wrote:
> >no doubt,  I have touched some problems PG has, right?  if PG is so good,  
> >is there any necessary for the team  to improve PG again?
> 
> See?  Troll...
> 
> The guy worships MySQL, just in case folks haven't made the connection.
> 
> I'm going to ignore him from now on, suggest others do the same, I'm sure
> he'll go away eventually.
> 
> 
> 
> - Don Baccus, Portland OR <dhogaza@pacifier.com>
>   Nature photos, on-line guides, Pacific Northwest
>   Rare Bird Alert Service and other goodies at
>   http://donb.photo.net.
> 


Re: beta testing version

From
Thomas Lockhart
Date:
> no doubt,  I have touched some problems PG has, right?  if PG is so good,
> is there any necessary for the team  to improve PG again?

*rofl*

Good call Don :)
                    - Thomas


Re: Please advise features in 7.1 (SUMMARY)

From
"Ross J. Reedstrom"
Date:
On Tue, Nov 28, 2000 at 02:04:01PM +1300, John Huttley wrote:
> Thanks for your help, everyone.
> 
> This is a summary of replies.
> 
> 1. Calculated fields in table definitions . eg.
> 
>     Create table test (
>              A Integer,
>              B integer,
>             the_sum   As  (A+B),
> );
> 
> This functionality can be achieved through the use of views.

Using a view for this isn't quite the same functionality as a computed
field, from what I understand, since the calculation will be done at
SELECT time, rather than INSERT/UPDATE.

This can also be done with a trigger, which, while more cumbersome to
write, would be capable of doing the math at modification time.

Ross
-- 
Open source code is like a natural resource, it's the result of providing
food and sunshine to programmers, and then staying out of their way.
[...] [It] is not going away because it has utility for both the developers 
and users independent of economic motivations.  Jim Flynn, Sunnyvale, Calif.


Re: beta testing version

From
Hannu Krosing
Date:
xuyifeng wrote:
> 

I just noticed this conversation so I have not followed all of it, 
but you seem to have strange priorities

> I just want PG can be improved quickly, for me crash recover is very urgent problem,

Crash avoidance is usually much more urgent, at least on production
servers.

> otherewise PG is forced to stay on my desktop machine, We'll dare not  move it to our Server,

Why do you keep crashing your server ?

If your desktop crashes less often than your server you might exchange
them, no?

> I always see myself as a customer,  customer is always right.

I'd like to see myself as being always right too ;)

-------------------
Hannu


Re: beta testing version

From
The Hermit Hacker
Date:
On Tue, 28 Nov 2000, xuyifeng wrote:

> no doubt,  I have touched some problems PG has, right?  if PG is so good,  
> is there any necessary for the team  to improve PG again?

There is always room for improvements for any software package ... whether
it be PgSQL, Linux, FreeBSD or PHPBuilder ... as ppl learn more,
understand more and come up with new techniques, things tend to get better
...
> > Regards,
> XuYifeng
> 
> ----- Original Message ----- 
> From: Don Baccus <dhogaza@pacifier.com>
> To: xuyifeng <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org>
> Sent: Tuesday, November 28, 2000 10:37 PM
> Subject: Re: [HACKERS] beta testing version
> 
> 
> > At 04:17 PM 11/28/00 +0800, xuyifeng wrote:
> > >Hi,
> > >
> > >   how long is PG7.1 already in beta testing? can it be released before Christmas day?
> > >   can PG7.1  will recover database from system crash?
> > 
> > This guy's a troll from the PHP Builder's site (at least, Tim Perdue and I suspect this
> > due to some posts he made in regard to Tim's SourceForge/Postgres article).
> > 
> > Since he's read Tim's article, and at least some of the follow-up posts (given that
> > he's posted responses himself), he should know by now that PG 7.1 is still in a pre-beta
> > state and won't be released before Christmas day.  I also posted a fairly long answer
> > to a question Tim's posted at phpbuilder.com regarding recoverability and this guy's
> > undoubtably read it, too.
> > 
> > Have I forgotten anything, xuyifeng?
> > 
> > 
> > 
> > - Don Baccus, Portland OR <dhogaza@pacifier.com>
> >   Nature photos, on-line guides, Pacific Northwest
> >   Rare Bird Alert Service and other goodies at
> >   http://donb.photo.net.
> > 
> 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: beta testing version

From
The Hermit Hacker
Date:
On Tue, 28 Nov 2000, xuyifeng wrote:

> you are complete wrong, if I don't like PG, I'll never go here or talk
> anything about PG, I don't care it. I just want PG can be improved
> quickly, for me crash recover is very urgent problem, otherewise PG is
> forced to stay on my desktop machine, We'll dare not move it to our
> Server, I always see myself as a customer, customer is always right.

except when they are wrong ...

... but, as for crash recovery, the plan right now is that on Thursday, Dec
1st, 7.1 goes beta ... if you are so keen on the crash recovery stuff,
what I'd recommend is grab the snapshot, and work with that on your
machine, get used to the features that it presents and report any bugs you
find.  Between beta and release, there will be bug fixes, but no features
added, so it makes for a relatively safe starting point.  I wouldn't use
it in production (or, rather, I personally would, but it isn't something
I'd recommend for the faint of heart), but it will give you a base to
start from ...

release will be shortly into the new year, depending on what sorts of bugs
ppl report and how quickly they can be fixed ... if all goes well, Jan 1st
will be release date, but, from experience, we're looking at closer to jan
15th :)
> > Regards,
> XuYifeng
> 
> 
> ----- Original Message ----- 
> From: Don Baccus <dhogaza@pacifier.com>
> To: xuyifeng <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org>
> Sent: Tuesday, November 28, 2000 11:16 PM
> Subject: Re: [HACKERS] beta testing version
> 
> 
> > At 11:15 PM 11/28/00 +0800, xuyifeng wrote:
> > >no doubt,  I have touched some problems PG has, right?  if PG is so good,  
> > >is there any necessary for the team  to improve PG again?
> > 
> > See?  Troll...
> > 
> > The guy worships MySQL, just in case folks haven't made the connection.
> > 
> > I'm going to ignore him from now on, suggest others do the same, I'm sure
> > he'll go away eventually.
> > 
> > 
> > 
> > - Don Baccus, Portland OR <dhogaza@pacifier.com>
> >   Nature photos, on-line guides, Pacific Northwest
> >   Rare Bird Alert Service and other goodies at
> >   http://donb.photo.net.
> > 
> 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: beta testing version

From
The Hermit Hacker
Date:
On Tue, 28 Nov 2000, Hannu Krosing wrote:

> xuyifeng wrote:
> > 
> 
> I just noticed this conversation so I have not followed all of it, 
> but you seem to have strange priorities
> 
> > I just want PG can be improved quickly, for me crash recover is very urgent problem,
> 
> Crash avoidance is usually much more urgent, at least on production
> servers.

Good call, but I kinda jumped to the conclusion that since PgSQL itself
isn't that crash prone, it's his OS or his hardware that was the problem :0




Re: beta testing version

From
"Mitch Vincent"
Date:
This is one of the not-so-stomped boxes running PostgreSQL -- I've never
restarted PostgreSQL on it since it was installed.

12:03pm  up 122 days,  7:54,  1 user,  load average: 0.08, 0.11, 0.09

I had some index corruption problems in 6.5.3 but since 7.0.X I haven't
heard so much as a peep from any PostgreSQL backend. It's superbly stable on
all my machines..

Damn good work guys.

-Mitch

----- Original Message -----
From: "The Hermit Hacker" <scrappy@hub.org>
To: "Hannu Krosing" <hannu@tm.ee>
Cc: "xuyifeng" <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org>;
"Don Baccus" <dhogaza@pacifier.com>
Sent: Tuesday, November 28, 2000 8:53 AM
Subject: Re: [HACKERS] beta testing version


> On Tue, 28 Nov 2000, Hannu Krosing wrote:
>
> > xuyifeng wrote:
> > >
> >
> > I just noticed this conversation so I have not followed all of it,
> > but you seem to have strange priorities
> >
> > > I just want PG can be improved quickly, for me crash recover is very
> > > urgent problem,
> >
> > Crash avoidance is usually much more urgent, at least on production
> > servers.
>
> Good call, but I kinda jumped to the conclusion that since PgSQL itself
> isn't that crash prone, its his OS or his hardware that was the problem :0
>
>
>



Re: beta testing version

From
Ron Chmara
Date:
Mitch Vincent wrote:
> 
> This is one of the not-so-stomped boxes running PostgreSQL -- I've never
> restarted PostgreSQL on it since it was installed.
> 12:03pm  up 122 days,  7:54,  1 user,  load average: 0.08, 0.11, 0.09
> I had some index corruption problems in 6.5.3 but since 7.0.X I haven't
> heard so much as a peep from any PostgreSQL backend. It's superbly stable on
> all my machines..

I have a 6.5.x box at 328 days of active use.

Crash "recovery" seems silly to me. :-)

-Bop

--
Brought to you from boop!, the dual boot Linux/Win95 Compaq Presario 1625
laptop, currently running RedHat 6.1. Your bopping may vary.


Re: beta testing version

From
Don Baccus
Date:
At 03:25 PM 11/28/00 -0700, Ron Chmara wrote:
>Mitch Vincent wrote:
>> 
>> This is one of the not-so-stomped boxes running PostgreSQL -- I've never
>> restarted PostgreSQL on it since it was installed.
>> 12:03pm  up 122 days,  7:54,  1 user,  load average: 0.08, 0.11, 0.09
>> I had some index corruption problems in 6.5.3 but since 7.0.X I haven't
>> heard so much as a peep from any PostgreSQL backend. It's superbly stable on
>> all my machines..
>
>I have a 6.5.x box at 328 days of active use.
>
>Crash "recovery" seems silly to me. :-)

Well, not really ... but since our troll is a devoted MySQL user, it's a bit
of a red-herring anyway, at least as regards his own server.

You know, the one he's afraid to put Postgres on, but sleeps soundly at
night knowing the mighty bullet-proof MySQL with its full transaction
semantics, archive logging and recovery from REDO logs and all that
will save him? :)

Again ... he's a troll, not even a very entertaining one.




- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
 


Re: beta testing version

From
The Hermit Hacker
Date:
On Tue, 28 Nov 2000, Don Baccus wrote:

> At 03:25 PM 11/28/00 -0700, Ron Chmara wrote:
> >Mitch Vincent wrote:
> >> 
> >> This is one of the not-so-stomped boxes running PostgreSQL -- I've never
> >> restarted PostgreSQL on it since it was installed.
> >> 12:03pm  up 122 days,  7:54,  1 user,  load average: 0.08, 0.11, 0.09
> >> I had some index corruption problems in 6.5.3 but since 7.0.X I haven't
> >> heard so much as a peep from any PostgreSQL backend. It's superbly stable on
> >> all my machines..
> >
> >I have a 6.5.x box at 328 days of active use.
> >
> >Crash "recovery" seems silly to me. :-)
> 
> Well, not really ... but since our troll is a devoted MySQL user, it's a bit
> of a red-herring anyway, at least as regards his own server.
> 
> You know, the one he's afraid to put Postgres on, but sleeps soundly at
> night knowing the mighty bullet-proof MySQL with its full transaction
> semantics, archive logging and recovery from REDO logs and all that
> will save him? :)
> 
> Again ... he's a troll, not even a very entertaining one.

Or informed?




Re: beta testing version

From
"xuyifeng"
Date:
No, I just tested how solid PgSQL is. I ran a program that was busy inserting records into a PG table, then I
suddenly pulled the power from my machine. After restarting PG I could not insert any records into the database
table; all backends were dead without any response (not a core dump). Note that I am using FreeBSD 4.2,
which is rock solid; it was not an OS crash, it just lost power. We use Windows NT and MSSQL on our production
server. Before we accepted MSSQL, we used this method to test whether MSSQL could endure this kind of strike;
it was OK, all databases were safely recovered and we could continue our work. We are a stock exchange company,
our servers store millions of dollars of financial numbers, and we cannot afford any problems in this case.
We are using a UPS, but a UPS is not everything; if you bet everything on a UPS, you must be an idiot.
I know you must be an advocate of PG, but we are professional customers, corporate users; we store critical
data in the database, not your garbage data.

Regards,
XuYifeng

----- Original Message ----- 
From: Don Baccus <dhogaza@pacifier.com>
To: Ron Chmara <ron@opus1.com>; Mitch Vincent <mitch@venux.net>; <pgsql-hackers@postgresql.org>
Sent: Wednesday, November 29, 2000 6:58 AM
Subject: Re: [HACKERS] beta testing version


> At 03:25 PM 11/28/00 -0700, Ron Chmara wrote:
> >Mitch Vincent wrote:
> >> 
> >> This is one of the not-so-stomped boxes running PostgreSQL -- I've never
> >> restarted PostgreSQL on it since it was installed.
> >> 12:03pm  up 122 days,  7:54,  1 user,  load average: 0.08, 0.11, 0.09
> >> I had some index corruption problems in 6.5.3 but since 7.0.X I haven't
> >> heard so much as a peep from any PostgreSQL backend. It's superbly stable on
> >> all my machines..
> >
> >I have a 6.5.x box at 328 days of active use.
> >
> >Crash "recovery" seems silly to me. :-)
> 
> Well, not really ... but since our troll is a devoted MySQL user, it's a bit
> of a red-herring anyway, at least as regards his own server.
> 
> You know, the one he's afraid to put Postgres on, but sleeps soundly at
> night knowing the mighty bullet-proof MySQL with its full transaction
> semantics, archive logging and recovery from REDO logs and all that
> will save him? :)
> 
> Again ... he's a troll, not even a very entertaining one.
> 
> 
> 
> 
> - Don Baccus, Portland OR <dhogaza@pacifier.com>
>   Nature photos, on-line guides, Pacific Northwest
>   Rare Bird Alert Service and other goodies at
>   http://donb.photo.net.
> 


Re: beta testing version

From
Nathan Myers
Date:
On Wed, Nov 29, 2000 at 09:59:34AM +0800, xuyifeng wrote:
> NO, I just tested how solid PgSQL is, I run a program busy inserting
> record into PG table, when I suddenly pulled out power from my machine ...

Nobody claims PostgreSQL is proof against power failures.

> ... We use WindowsNT and MSSQL on our production server,
> before we accept MSSQL, we use this method to test if MSSQL can endure
> this kind of strike, it's OK, all databases are safely recovered, we
> can continue our work. 

You got lucky.  Period.  MSSQL is not proof against power failures,
and neither is NTFS.  In particular, that the database accepted 
transactions afterward is far from proof that its files were not 
corrupted.

Incompetent testers produce invalid tests.  Invalid tests lead to 
meaningless conclusions.  Incompetent testers' employers suffer
from false confidence, and poor decision-making.

Nathan Myers
ncm@zembu.com



Re: beta testing version

From
Hannu Krosing
Date:
xuyifeng wrote:
> 
> NO, I just tested how solid PgSQL  is,  I run a program busy inserting record into PG table,  when I
> suddenly pulled out power from my machine and  restarted PG, I can not insert any record into database
> table,  all backends are dead without any respone (not core dump), note that I am using FreeBSD 4.2,
> it's rock solid,  it's not OS crash, it just losted power.  We use WindowsNT and MSSQL on our production
> server,  before we accept MSSQL, we use this method to test if MSSQL can endure this kind of strik,
> it's OK, all databases are safely recovered, we can continue our work.

The only way to safely recover them after a major crash would be
manual/supervised recovery from backups + logs.

As not even NTFS is safe from power failures (I have lost an NTFS file
system a few times due to not having a UPS), it is irrelevant whether MSSQL
is. Even if MSSQL is "crash proof" (tm), how can you _prove_ to your
customers/superiors that the last N minutes of transactions were not lost?

If the DB is able to "continue your work" after the crash, you can of
course cover up the fact that the 
crash even happened and blame the lost transactions on someone else when
they surface at the next audit ;)

Or just claim that computer technology is so complicated that losing a
few transactions is normal - but you could go on working ;) :~) ;-p

What you want for mission-critical data is replicated databases or at
least off-site logging, not "crash 
recovery" at some arbitrarily chosen layer. You will need to recover
from the crash even if it destroys 
the whole computer.

May I suggest another test for your NT/MSSQL setup - don't pull the plug
but change the input voltage to 10 000 VAC; if this goes well, test with
100 000 VAC ;)
This is also a scenario much less likely to be protected by a UPS than
power loss.

> we are a stock exchange company,
> our server are storing millilion $ finance number, we don't hope there are any problems in this case,
> we are using UPS,  but UPS is not everything,  it you bet everything on UPS, you must be idiot.

So are you, if you bet everything on hoping that DB will do crash
recovery from any type of crash.

A common case of "crash" that may need to be recovered from is also a
human error, like typing drop database at the wrong console.

> I know you must be an avocation of PG, but we are professional customer, corporation user, we store critical
> data into database, not your garbage data.

Then you'd better have crash recovery infrastructure/procedures in
place and not hope that the DB server will do that automatically for you.

--------------------
Hannu


Re: beta testing version

From
"Magnus Naeslund\(f\)"
Date:
I don't have the same luck, sorry to say!

I am running Mandrake Linux with an OpenWall-patched 2.2.17 kernel, dual P3
550MHz, 1GB memory.
It's a really busy webserver that is constantly running with a load of 10.
Sometimes it spikes to a load of ~40-50 (the most we had was 114(!)).
I am running PostgreSQL 7.0.2 (from the Mandrake RPMs).

One problem I have is that in one database we rapidly insert/delete in some
tables, and to maintain good performance on that db, I have to run a
vacuum every hour(!).
I think that db has excessive indexes all over the place (if that could have
anything to do with it?).

Another problem that is more severe is that the database "crashes"
(read: stops working); if I run psql and do a select it says
"001129.07:04:15.688 [25474] FATAL 1:  Memory exhausted in AllocSetAlloc()"
and fails.
I have a cron script that watches postgres, and restarts it if it can't get a
select to work.
It fails this way maybe once every day or two.
I've searched the mailing list archives for this problem, but it always
seems that my problem doesn't fit the descriptions of the other people's
problems generating this error message.

I have not found the right time to upgrade to 7.0.3 yet, and I don't know if
that would solve anything.

Another problem I have is that I get "001128.12:58:01.248 [23444] FATAL 1:
Socket command type  unknown" in my logs. I don't know if I get that from
the unix ODBC driver, the remote windows ODBC driver, or from standard unix db
connections.

I get "pq_recvbuf: unexpected EOF on client connection" a lot too, but I
think that only indicates that the socket was closed in a not-so-nice way, and
that it is not a "real" error.
It seems that the psql windows ODBC driver is generating this.

The postmaster is running with these parameters: "-N 512 -B 1024 -i -o -S
4096"

But as a happy note I can tell you that we have a Linux box here (Pentium
100, kernel 2.0.3x) that has nearly 1000 days of uptime, and runs postgres 6.5.x.
It has never failed, not even a single time :)

Magnus Naeslund

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Programmer/Networker [|] Magnus Naeslund
PGP Key: http://www.genline.nu/mag_pgp.txt
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


----- Original Message -----
From: "Mitch Vincent" <mitch@venux.net>
To: <pgsql-hackers@postgresql.org>
Sent: Tuesday, November 28, 2000 19:12
Subject: Re: [HACKERS] beta testing version


> This is one of the not-so-stomped boxes running PostgreSQL -- I've never
> restarted PostgreSQL on it since it was installed.
>
> 12:03pm  up 122 days,  7:54,  1 user,  load average: 0.08, 0.11, 0.09
>
> I had some index corruption problems in 6.5.3 but since 7.0.X I haven't
> heard so much as a peep from any PostgreSQL backend. It's superbly stable on
> all my machines..
>
> Damn good work guys.
>
> -Mitch
>
> ----- Original Message -----
> From: "The Hermit Hacker" <scrappy@hub.org>
> To: "Hannu Krosing" <hannu@tm.ee>
> Cc: "xuyifeng" <jamexu@telekbird.com.cn>; <pgsql-hackers@postgresql.org>;
> "Don Baccus" <dhogaza@pacifier.com>
> Sent: Tuesday, November 28, 2000 8:53 AM
> Subject: Re: [HACKERS] beta testing version
>
>
> > On Tue, 28 Nov 2000, Hannu Krosing wrote:
> >
> > > xuyifeng wrote:
> > > >
> > >
> > > I just noticed this conversation so I have not followed all of it,
> > > but you seem to have strange priorities
> > >
> > > > I just want PG can be improved quickly, for me crash recover is very
> > > > urgent problem,
> > >
> > > Crash avoidance is usually much more urgent, at least on production
> > > servers.
> >
> > Good call, but I kinda jumped to the conclusion that since PgSQL itself
> > isn't that crash prone, its his OS or his hardware that was the problem :0
> >
> >
> >
>
>



Re: beta testing version

From
"Horst Herb"
Date:
> server,  before we accept MSSQL, we use this method to test if MSSQL can
> endure this kind of strik,
> it's OK, all databases are safely recovered, we can continue our work. we
> are a stock exchange company,

And how exactly did you test the integrity of your data? Unless every single
record has got at least a CRC stored somewhere, you won't be able AT ALL to
check for database integrity. The reports from NTFS and MSSQL internal
checking are meaningless for your data integrity.

We are doing this checksumming in our project, and already got a few nasty
surprises when the "CRC daemon" stumbled over a few corrupted records we
never would have discovered otherwise. Exactly this checksumming weeded out
our server alternatives; at present only PostgreSQL is left, as it was the most
reliable of all.
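
For illustration only, a sketch of the kind of per-record checksumming
described above; it assumes a user-supplied crc32(text) function (there is no
built-in CRC function in PostgreSQL, so it would have to be added, e.g. in C),
and the table and column names are made up:

    ALTER TABLE records ADD COLUMN row_crc integer;

    CREATE FUNCTION records_crc() RETURNS opaque AS '
        BEGIN
            -- store a checksum of the row payload on every insert/update
            NEW.row_crc := crc32(NEW.payload);
            RETURN NEW;
        END;
    ' LANGUAGE 'plpgsql';

    CREATE TRIGGER records_crc_trig BEFORE INSERT OR UPDATE ON records
        FOR EACH ROW EXECUTE PROCEDURE records_crc();

    -- A periodic "CRC daemon" then re-verifies the stored checksums:
    SELECT oid FROM records WHERE row_crc <> crc32(payload);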

Horst



Re: beta testing version

From
Tom Lane
Date:
"Magnus Naeslund\(f\)" <mag@fbab.net> writes:
> Another other problem that is more severe is that the database "crashes"
> (read: stops working), if i run psql and do a select it says
> "001129.07:04:15.688 [25474] FATAL 1:  Memory exhausted in AllocSetAlloc()"
> and fails.

That's odd.  Does any select at all --- even, say, "SELECT 2+2" --- fail
like that, or just ones referencing a particular table, or maybe you
meant just one specific query?

> Another problem i have is that i get "001128.12:58:01.248 [23444] FATAL 1:
> Socket command type  unknown" in my logs. I don't know if i get that from
> the unix odbc driver, the remote windows odbc driver, or in unix standard db
> connections.

Do any of your client applications complain that they're being
disconnected on?  This might come from something not doing disconnection
cleanly, in which case the client probably wouldn't notice anything wrong.

> I get "pq_recvbuf: unexpected EOF on client connection" alot too, but that i
> think only indicates that the socket was closed in a not-so-nice way, and
> that it is no "real" error.
> It seems that the psql windows odbc driver is generating this.

That message is quite harmless AFAIK, although it'd be nice to clean up
the ODBC driver so that it disconnects in the approved fashion.
        regards, tom lane


Re: Please advise features in 7.1 (SUMMARY)

From
"xuyifeng"
Date:
Is "if" clause support in PG? 
for example:
"drop table aa if exist"
"insert into aa values(1) if not exists select * from aa where i=1"

I would like PG support it.
---
XuYifeng

----- Original Message ----- 
From: John Huttley <John@mwk.co.nz>
To: <pgsql-hackers@postgresql.org>
Sent: Tuesday, November 28, 2000 9:04 AM
Subject: [HACKERS] Please advise features in 7.1 (SUMMARY)


> Thanks for your help, everyone.
> 
> This is a summary of replies.
> 
> 1. Calculated fields in table definitions . eg.
> 
>     Create table test (
>              A Integer,
>              B integer,
>             the_sum   As  (A+B),
> );
> 
> This functionality can be achieved through the use of views.
> Implementing the create table syntax  may not be too hard,
> but not in 7.1...
> 
> 2  Parameterised Triggers
> 
> Functionality is there, just that the documentation gave the wrong implication.
> An user manual example of using parameterised triggers to implement referential
> integrity
> would be welcome.
> 
> 3. Stored Procedures returning a record set.
> 
> Dream on!
> 
> 
> Regards
> 
> John
> 
> 
> 


Re: beta testing version

From
Alain Toussaint
Date:
> our server alternatives; at present only PostgreSQL is left, was the most
> reliable of all.

Mind if I ask on which platform (operating system) you did your test? I'm
mostly used to Linux, but after I've paid off my computer (still 5 months
remaining), I want to get a used SGI box from a reputable source and put NetBSD
as well as PostgreSQL on it (and maybe AOLserver too, depending on the
threading model of NetBSD).

Alain Toussaint



Re: Please advise features in 7.1 (SUMMARY)

From
Thomas Lockhart
Date:
> Is "if" clause support in PG?
> for example:
> "drop table aa if exist"
> "insert into aa values(1) if not exists select * from aa where i=1"

No. afaict it is not in any SQL standard, so is unlikely to get much
attention from developers.
                    - Thomas


Re: Please advise features in 7.1 (SUMMARY)

From
Andrew Snow
Date:
On Thu, 30 Nov 2000, Thomas Lockhart wrote:

> > Is "if" clause support in PG?
> > for example:
> > "drop table aa if exist"
> > "insert into aa values(1) if not exists select * from aa where i=1"
> 
> No. afaict it is not in any SQL standard, so is unlikely to get much
> attention from developers.

Plus, for that second one can't you just do:

INSERT INTO aa SELECT 1 WHERE NOT EXISTS (SELECT * FROM aa WHERE i=1);


- Andrew




Re: Please advise features in 7.1 (SUMMARY)

From
Don Baccus
Date:
At 05:24 AM 11/30/00 +0000, Thomas Lockhart wrote:
>> Is "if" clause support in PG?
>> for example:
>> "drop table aa if exist"
>> "insert into aa values(1) if not exists select * from aa where i=1"
>
>No. afaict it is not in any SQL standard, so is unlikely to get much
>attention from developers.

The insert, at least, can be written in standard SQL anyway...



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
 


Re: beta testing version

From
The Hermit Hacker
Date:
v7.1 should improve crash recovery for situations like this ... you'll
still have to do a recovery of the data on corruption of this magnitude,
but at least with the WAL stuff that Vadim is producing, you'll be able to
recover up until the point that the power cable was pulled out of the wall
...


On Wed, 29 Nov 2000, xuyifeng wrote:

> NO, I just tested how solid PgSQL  is,  I run a program busy inserting record into PG table,  when I 
> suddenly pulled out power from my machine and  restarted PG, I can not insert any record into database
> table,  all backends are dead without any respone (not core dump), note that I am using FreeBSD 4.2, 
> it's rock solid,  it's not OS crash, it just losted power.  We use WindowsNT and MSSQL on our production
> server,  before we accept MSSQL, we use this method to test if MSSQL can endure this kind of strik,
> it's OK, all databases are safely recovered, we can continue our work. we are a stock exchange company,
> our server are storing millilion $ finance number, we don't hope there are any problems in this case, 
> we are using UPS,  but UPS is not everything,  it you bet everything on UPS, you must be idiot. 
> I know you must be an avocation of PG, but we are professional customer, corporation user, we store critical
> data into database, not your garbage data.
> 
> Regards,
> XuYifeng
> 
> ----- Original Message ----- 
> From: Don Baccus <dhogaza@pacifier.com>
> To: Ron Chmara <ron@opus1.com>; Mitch Vincent <mitch@venux.net>; <pgsql-hackers@postgresql.org>
> Sent: Wednesday, November 29, 2000 6:58 AM
> Subject: Re: [HACKERS] beta testing version
> 
> 
> > At 03:25 PM 11/28/00 -0700, Ron Chmara wrote:
> > >Mitch Vincent wrote:
> > >> 
> > >> This is one of the not-so-stomped boxes running PostgreSQL -- I've never
> > >> restarted PostgreSQL on it since it was installed.
> > >> 12:03pm  up 122 days,  7:54,  1 user,  load average: 0.08, 0.11, 0.09
> > >> I had some index corruption problems in 6.5.3 but since 7.0.X I haven't
> > >> heard so much as a peep from any PostgreSQL backend. It's superbly stable on
> > >> all my machines..
> > >
> > >I have a 6.5.x box at 328 days of active use.
> > >
> > >Crash "recovery" seems silly to me. :-)
> > 
> > Well, not really ... but since our troll is a devoted MySQL user, it's a bit
> > of a red-herring anyway, at least as regards his own server.
> > 
> > You know, the one he's afraid to put Postgres on, but sleeps soundly at
> > night knowing the mighty bullet-proof MySQL with its full transaction
> > semantics, archive logging and recovery from REDO logs and all that
> > will save him? :)
> > 
> > Again ... he's a troll, not even a very entertaining one.
> > 
> > 
> > 
> > 
> > - Don Baccus, Portland OR <dhogaza@pacifier.com>
> >   Nature photos, on-line guides, Pacific Northwest
> >   Rare Bird Alert Service and other goodies at
> >   http://donb.photo.net.
> > 
> 

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org 
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org 



Re: beta testing version

From
Don Baccus
Date:
At 07:02 PM 11/30/00 -0400, The Hermit Hacker wrote:
>
>v7.1 should improve crash recovery for situations like this ... you'll
>still have to do a recovery of the data on corruption of this magnitude,
>but at least with the WAL stuff that Vadim is producing, you'll be able to
>recover up until the point that the power cable was pulled out of the wall

No, WAL won't help if an actual database file is corrupted, say by a
disk drive hosing a block or portion thereof with zeros.  WAL-based
recovery at startup works on an intact database.

Still, in the general case you need real backup and recovery tools.
Then you can apply archives of REDOs to a backup made of a snapshot
and rebuild up to the last transaction.   As opposed to your last
pg_dump.

So what about mirroring (RAID 1)?  As the docs tell ya, that protects
you against one drive failing but not against power failure, which can
cause bad data to be written to both mirrors if both are actively 
writing when the plug is pulled.

Power failures are evil, face it! :)



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
 


Re: beta testing version

From
The Hermit Hacker
Date:
On Thu, 30 Nov 2000, Don Baccus wrote:

> At 07:02 PM 11/30/00 -0400, The Hermit Hacker wrote:
> >
> >v7.1 should improve crash recovery for situations like this ... you'll
> >still have to do a recovery of the data on corruption of this magnitude,
> >but at least with the WAL stuff that Vadim is producing, you'll be able to
> >recover up until the point that the power cable was pulled out of the wall
> 
> No, WAL won't help if an actual database file is corrupted, say by a
> disk drive hosing a block or portion thereof with zeros.  WAL-based
> recovery at startup works on an intact database.

No, WAL does help, cause you can then pull in your last dump and recover
up to the moment that power cable was pulled out of the wall ...




Re: beta testing version

From
ncm@zembu.com (Nathan Myers)
Date:
On Thu, Nov 30, 2000 at 07:47:08PM -0400, The Hermit Hacker wrote:
> On Thu, 30 Nov 2000, Don Baccus wrote:
> > At 07:02 PM 11/30/00 -0400, The Hermit Hacker wrote:
> > >
> > >v7.1 should improve crash recovery for situations like this ... you'll
> > >still have to do a recovery of the data on corruption of this magnitude,
> > >but at least with the WAL stuff that Vadim is producing, you'll be able to
> > >recover up until the point that the power cable was pulled out of the wall
> > 
> > No, WAL won't help if an actual database file is corrupted, say by a
> > disk drive hosing a block or portion thereof with zeros.  WAL-based
> > recovery at startup works on an intact database.
> 
> No, WAL does help, cause you can then pull in your last dump and recover
> up to the moment that power cable was pulled out of the wall ...

False, on so many counts I can't list them all.

Nathan Myers
ncm



Re: beta testing version

From
"Mitch Vincent"
Date:
> > No, WAL does help, cause you can then pull in your last dump and recover
> > up to the moment that power cable was pulled out of the wall ...
>
> False, on so many counts I can't list them all.

Why? If we're not talking hardware damage and you have a dump made sometime
previous to the crash, why wouldn't that work to restore the database? I've
had to restore a corrupted database from a dump before, there wasn't any
hardware damage, the database (more specifically the indexes) were
corrupted. Of course WAL wasn't around but I don't see why this wouldn't
work...

Note I'm not saying you're wrong, just asking that you explain your comment
a little more. If WAL can't be used to help recover from crashes where
database corruption occurs, what good is it?
-Mitch



Re: beta testing version

From
The Hermit Hacker
Date:
On Thu, 30 Nov 2000, Nathan Myers wrote:

> On Thu, Nov 30, 2000 at 07:47:08PM -0400, The Hermit Hacker wrote:
> > On Thu, 30 Nov 2000, Don Baccus wrote:
> > > At 07:02 PM 11/30/00 -0400, The Hermit Hacker wrote:
> > > >
> > > >v7.1 should improve crash recovery for situations like this ... you'll
> > > >still have to do a recovery of the data on corruption of this magnitude,
> > > >but at least with the WAL stuff that Vadim is producing, you'll be able to
> > > >recover up until the point that the power cable was pulled out of the wall
> > > 
> > > No, WAL won't help if an actual database file is corrupted, say by a
> > > disk drive hosing a block or portion thereof with zeros.  WAL-based
> > > recovery at startup works on an intact database.
> > 
> > No, WAL does help, cause you can then pull in your last dump and recover
> > up to the moment that power cable was pulled out of the wall ...
> 
> False, on so many counts I can't list them all.

would love to hear them ... I'm always opening to having my
misunderstandings corrected ...



Re: beta testing version

From
Vince Vielhaber
Date:
On Thu, 30 Nov 2000, Nathan Myers wrote:

> On Thu, Nov 30, 2000 at 07:47:08PM -0400, The Hermit Hacker wrote:
> > On Thu, 30 Nov 2000, Don Baccus wrote:
> > > At 07:02 PM 11/30/00 -0400, The Hermit Hacker wrote:
> > > >
> > > >v7.1 should improve crash recovery for situations like this ... you'll
> > > >still have to do a recovery of the data on corruption of this magnitude,
> > > >but at least with the WAL stuff that Vadim is producing, you'll be able to
> > > >recover up until the point that the power cable was pulled out of the wall
> > >
> > > No, WAL won't help if an actual database file is corrupted, say by a
> > > disk drive hosing a block or portion thereof with zeros.  WAL-based
> > > recovery at startup works on an intact database.
> >
> > No, WAL does help, cause you can then pull in your last dump and recover
> > up to the moment that power cable was pulled out of the wall ...
>
> False, on so many counts I can't list them all.

*YAWN*





Re: beta testing version

From
ncm@zembu.com (Nathan Myers)
Date:
On Thu, Nov 30, 2000 at 05:37:58PM -0800, Mitch Vincent wrote:
> > > No, WAL does help, cause you can then pull in your last dump and recover
> > > up to the moment that power cable was pulled out of the wall ...
> >
> > False, on so many counts I can't list them all.
> 
> Why? If we're not talking hardware damage and you have a dump made
> sometime previous to the crash, why wouldn't that work to restore the
> database? I've had to restore a corrupted database from a dump before,
> there wasn't any hardware damage, the database (more specifically the
> indexes) were corrupted. Of course WAL wasn't around but I don't see
> why this wouldn't work...

I posted a more detailed explanation a few minutes ago, but
it appears to have been eaten by the mailing list server.

I won't re-post the explanations that you all have seen over the 
last two days, about disk behavior during a power outage; they're 
in the archives (I assume -- when last I checked, web access to it 
didn't work).  Suffice to say that if you pull the plug, there is 
just too much about the state of the disks that is unknown.

As for replaying logs against a restored snapshot dump... AIUI, a 
dump records tuples by OID, but the WAL refers to TIDs.  Therefore, 
the WAL won't work as a re-do log to recover your transactions 
because the TIDs of the restored tables are all different.   

To get replaying we need an "update log", something that might be
in 7.2 if somebody does a lot of work.

> Note I'm not saying you're wrong, just asking that you explain your
> comment a little more. If WAL can't be used to help recover from
> crashes where database corruption occurs, what good is it?

The WAL is a performance optimization for the current recovery
capabilities, which assume uncorrupted table files.  It protects
against those database server crashes that happen not to corrupt 
the table files (i.e. most).  It doesn't protect against corruption 
of the tables, by bugs in PG or in the OS or from "hardware events".  
It also doesn't protect against OS crashes that result in 
write-buffered sectors not having been written before the crash.  
Practically, this means that WAL file entries older than a few 
seconds are not useful for much.

In general, it's foolish to expect a single system to store very
valuable data with much confidence.  To get full recoverability, 
you need a "hot failover" system duplicating your transactions in 
real time.  (Even then, you're vulnerable to application-level 
mistakes.)

Nathan Myers
ncm@zembu.com



Re: beta testing version

From
Don Baccus
Date:
At 05:15 PM 11/30/00 -0800, Nathan Myers wrote:

>As for replaying logs against a restored snapshot dump... AIUI, a 
>dump records tuples by OID, but the WAL refers to TIDs.  Therefore, 
>the WAL won't work as a re-do log to recover your transactions 
>because the TIDs of the restored tables are all different.   

Actually, the dump doesn't record tuple OIDs (unless you specifically
ask for them), it just dumps source sql.  When this gets reloaded
you get an equivalent database, but not the same database, that you
started out with.

That's why I've presumed you can't run the WAL against it.

If you and I are wrong I'd love to be surprised!



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest
  Rare Bird Alert Service and other goodies at
  http://donb.photo.net.
 


Re: beta testing version

From
ncm@zembu.com (Nathan Myers)
Date:
On Thu, Nov 30, 2000 at 07:02:01PM -0400, The Hermit Hacker wrote:
> 
> v7.1 should improve crash recovery ...
> ... with the WAL stuff that Vadim is producing, you'll be able to
> recover up until the point that the power cable was pulled out of 
> the wall.

Please do not propagate falsehoods like the above.  It creates
unsatisfiable expectations, and leads people to fail to take
proper precautions and recovery procedures.  

After a power outage on an active database, you may have corruption
at low levels of the system, and unless you have enormous redundancy
(and actually use it to verify everything) the corruption may go 
undetected and result in (subtly) wrong answers at any future time.

The logging in 7.1 protects transactions against many sources of 
database crash, but not necessarily against OS crash, and certainly
not against power failure.  (You might get lucky, or you might just 
think you were lucky.)  This is the same as for most databases; an
embedded database that talks directly to the hardware might be able
to do better.  

Nathan Myers
ncm@zembu.com


Re: beta testing version

From
The Hermit Hacker
Date:
On Thu, 30 Nov 2000, Nathan Myers wrote:

> On Thu, Nov 30, 2000 at 07:02:01PM -0400, The Hermit Hacker wrote:
> > 
> > v7.1 should improve crash recovery ...
> > ... with the WAL stuff that Vadim is producing, you'll be able to
> > recover up until the point that the power cable was pulled out of 
> > the wall.
> 
> Please do not propagate falsehoods like the above.  It creates
> unsatisfiable expectations, and leads people to fail to take
> proper precautions and recovery procedures.  
> 
> After a power outage on an active database, you may have corruption
> at low levels of the system, and unless you have enormous redundancy
> (and actually use it to verify everything) the corruption may go 
> undetected and result in (subtly) wrong answers at any future time.
> 
> The logging in 7.1 protects transactions against many sources of 
> database crash, but not necessarily against OS crash, and certainly
> not against power failure.  (You might get lucky, or you might just 
> think you were lucky.)  This is the same as for most databases; an
> embedded database that talks directly to the hardware might be able
> to do better.  

We're talking about transaction logging here ... nothing gets written to
it until completed ... if I take a "known to be clean" backup from the
night before, restore that and then run through the transaction logs, my
data should be clean, unless my tape itself is corrupt.  If the power goes
off half way through a write to the log, then that transaction wouldn't be
marked as completed and won't roll into the restore ...

if a disk goes corrupt, I'd expect that the redo log would possibly have a
problem with corruption .. but if I pull the plug, unless I've somehow
damaged the disk, I would expect my redo log to be clean *and*, unless
Vadim totally messed something up, if there is any corruption in the redo
log, I'd expect that restoring from it would generate some red flags ...





Re: beta testing version

From
Alex Pilosov
Date:
On Thu, 30 Nov 2000, Nathan Myers wrote:

> After a power outage on an active database, you may have corruption
> at low levels of the system, and unless you have enormous redundancy
> (and actually use it to verify everything) the corruption may go 
> undetected and result in (subtly) wrong answers at any future time.
Nathan, why are you so hostile against postgres? Is there an ax to grind?

The conditions under which WAL will completely recover your database:
1) OS guarantees complete ordering of fsync()'d writes. (i.e. having two
blocks A and B, A is fsync'd before B, it could NOT happen that B is on
disk but A is not).
2) on boot recovery, OS must not corrupt anything that was fsync'd.

Rule 1) is met by all unixish OSes in existence. Rule 2 is met by some
filesystems, such as reiserfs, tux2, and softupdates. 
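For concreteness, rule 1 is exactly what the usual write-ahead discipline
relies on: the log record describing a change is written and fsync'd before
the changed data page.  A minimal C sketch of that ordering (the file names,
record text and error handling are invented for the example):

    #include <sys/types.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #define BLCKSZ 8192

    /* Write-ahead rule: the log record describing a change must be
     * durable (fsync'd) before the changed data block may be written. */
    static void wal_write_then_data(int log_fd, int data_fd,
                                    const char *logrec, size_t loglen,
                                    const char *page, off_t pageoff)
    {
        if (write(log_fd, logrec, loglen) != (ssize_t) loglen ||
            fsync(log_fd) != 0)             /* step 1: log record hits disk */
        {
            perror("log write/fsync");
            exit(1);
        }
        if (lseek(data_fd, pageoff, SEEK_SET) < 0 ||
            write(data_fd, page, BLCKSZ) != BLCKSZ ||
            fsync(data_fd) != 0)            /* step 2: only now the data page */
        {
            perror("data write/fsync");
            exit(1);
        }
    }

    int main(void)
    {
        int  log_fd  = open("pg_log_segment", O_WRONLY | O_CREAT | O_APPEND, 0600);
        int  data_fd = open("table_datafile", O_WRONLY | O_CREAT, 0600);
        char page[BLCKSZ];

        if (log_fd < 0 || data_fd < 0) { perror("open"); return 1; }
        memset(page, 0, sizeof page);
        wal_write_then_data(log_fd, data_fd, "redo: page 0 updated", 20, page, 0);
        return 0;
    }

If the drive honours fsync(), the data page can never reach disk ahead of its
log record; rule 2 is then what guarantees the log itself survives the reboot.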

> The logging in 7.1 protects transactions against many sources of 
> database crash, but not necessarily against OS crash, and certainly
> not against power failure.  (You might get lucky, or you might just 
> think you were lucky.)  This is the same as for most databases; an
> embedded database that talks directly to the hardware might be able
> to do better.  



Re: beta testing version

From
"Vadim Mikheev"
Date:
> As for replaying logs against a restored snapshot dump... AIUI, a 
> dump records tuples by OID, but the WAL refers to TIDs.  Therefore, 
> the WAL won't work as a re-do log to recover your transactions 
> because the TIDs of the restored tables are all different.   

True for current way of backing up - ie saving data in "external"
(sql) format. But there is another way - saving data files in their
natural (binary) format. WAL records may be applied to
such dump, right?

> To get replaying we need an "update log", something that might be

What did you mean by "update log"?
Are you sure that WAL is not "update log" ? -:)

> in 7.2 if somebody does a lot of work.
> 
> > Note I'm not saying you're wrong, just asking that you explain your
> > comment a little more. If WAL can't be used to help recover from
> > crashes where database corruption occurs, what good is it?
> 
> The WAL is a performance optimization for the current recovery
> capabilities, which assume uncorrupted table files.  It protects
> against those database server crashes that happen not to corrupt 
> the table files (i.e. most).  It doesn't protect against corruption 
> of the tables, by bugs in PG or in the OS or from "hardware events".  
> It also doesn't protect against OS crashes that result in 
> write-buffered sectors not having been written before the crash.  
> Practically, this means that WAL file entries older than a few 
> seconds are not useful for much.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Even now, without BAR, WAL entries become useless only after checkpoints
(and I wouldn't recommend creating them every few seconds -:)). WAL-based
BAR will require archiving of log records.

Vadim




Re: beta testing version

From
ncm@zembu.com (Nathan Myers)
Date:
On Fri, Dec 01, 2000 at 01:54:23AM -0500, Alex Pilosov wrote:
> On Thu, 30 Nov 2000, Nathan Myers wrote:
> > After a power outage on an active database, you may have corruption
> > at low levels of the system, and unless you have enormous redundancy
> > (and actually use it to verify everything) the corruption may go 
> > undetected and result in (subtly) wrong answers at any future time.
>
> Nathan, why are you so hostile against postgres? Is there an ax to grind?

Alex, please don't invent enemies.  It's clear what important features
PostgreSQL still lacks; over the next several releases these features
will be implemented, at great expense.  PostgreSQL is useful and usable
now, given reasonable precautions and expectations.  In the future it
will satisfy greater (albeit still reasonable) expectations.

> The conditions under which WAL will completely recover your database:
>
> 1) OS guarantees complete ordering of fsync()'d writes. (i.e. having two
> blocks A and B, A is fsync'd before B, it could NOT happen that B is on
> disk but A is not).
> 2) on boot recovery, OS must not corrupt anything that was fsync'd.
> 
> Rule 1) is met by all unixish OSes in existence. Rule 2 is met by some
> filesystems, such as reiserfs, tux2, and softupdates. 

No.  The OS asks the disk to write blocks in a certain order, but 
disks normally reorder writes.  Not only that; as noted earlier, 
typical disks report the write completed long before the blocks 
actually hit the disk.

A logging file system protects against the simpler forms of OS crash,
where the OS data-structure corruption is noticed before any more disk
writes are scheduled.  It can't (by itself) protect against disk
errors.  For critical applications, you must supply that protection
yourself, with (e.g.) battery-backed mirroring.

> > The logging in 7.1 protects transactions against many sources of 
> > database crash, but not necessarily against OS crash, and certainly
> > not against power failure.  (You might get lucky, or you might just 
> > think you were lucky.)  This is the same as for most databases; an
> > embedded database that talks directly to the hardware might be able
> > to do better.  

The best possible database code can't overcome a broken OS or a broken 
disk.  It would be unreasonable to expect otherwise.

Nathan Myers
ncm@zembu.com 


Re: beta testing version

From
Ian Lance Taylor
Date:
  Date: Fri, 1 Dec 2000 01:54:23 -0500 (EST)
  From: Alex Pilosov <alex@pilosoft.com>

  On Thu, 30 Nov 2000, Nathan Myers wrote:

  > After a power outage on an active database, you may have corruption
  > at low levels of the system, and unless you have enormous redundancy
  > (and actually use it to verify everything) the corruption may go
  > undetected and result in (subtly) wrong answers at any future time.

  Nathan, why are you so hostile against postgres? Is there an ax to
  grind?

I don't think he is being hostile (I work with him, so I know that he
is generally pro-postgres).

  The conditions under which WAL will completely recover your database:

  1) OS guarantees complete ordering of fsync()'d writes. (i.e. having two
  blocks A and B, A is fsync'd before B, it could NOT happen that B is on
  disk but A is not).
  2) on boot recovery, OS must not corrupt anything that was fsync'd.

  Rule 1) is met by all unixish OSes in existence. Rule 2 is met by some
  filesystems, such as reiserfs, tux2, and softupdates.

I think you are missing his main point, which he stated before, which
is that modern disk hardware is both smarter and stupider than most
people realize.

Some disks cleverly accept writes into a RAM cache, and return a
completion signal as soon as they have done that.  They then feel free
to reorder the writes to magnetic media as they see fit.  This
significantly helps performance.  However, it means that all bets are off
on a sudden power loss.

Your rule 1 is met at the OS level, but it is not met at the physical
drive level.  The fact that the OS guarantees ordering of fsync()'d
writes means little since the drive is capable of reordering writes
behind the back of the OS.

At least with IDE, it is possible to tell the drive to disable this
sort of caching and reordering.  However, GNU/Linux, at least, does
not do this.  After all, doing it would hurt performance, and would
move us back to the old days when operating systems had to care a
great deal about disk geometry.
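(For what it's worth, this is roughly what hdparm's -W 0 option does on
Linux, as far as I remember its source; the sketch below assumes the
HDIO_DRIVE_CMD ioctl and the ATA SETFEATURES subcodes -- 0x82 to disable,
0x02 to re-enable the write cache -- so treat those constants as assumptions
and double-check before aiming it at a real drive.)

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/hdreg.h>   /* HDIO_DRIVE_CMD, WIN_SETFEATURES */

    int main(int argc, char **argv)
    {
        /* args[0] = ATA command, args[2] = feature register.
         * Feature 0x82 disables the drive's write cache, 0x02 re-enables it. */
        unsigned char args[4] = { WIN_SETFEATURES, 0, 0x82, 0 };
        int fd;

        if (argc != 2) {
            fprintf(stderr, "usage: %s /dev/hdX\n", argv[0]);
            return 1;
        }
        fd = open(argv[1], O_RDONLY | O_NONBLOCK);
        if (fd < 0) { perror(argv[1]); return 1; }

        if (ioctl(fd, HDIO_DRIVE_CMD, args) != 0)
            perror("HDIO_DRIVE_CMD(SETFEATURES)");
        else
            printf("write cache disabled on %s\n", argv[1]);

        close(fd);
        return 0;
    }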

I expect that careful attention to the physical disks you purchase can
help you avoid these problems.  For example, I would hope that EMC
disk systems handle power loss gracefully.  But if you buy ordinary
off the shelf PC hardware, you really do need to arrange for a UPS,
and some sort of automatic shutdown if the UPS is running low.
Otherwise, although the odds are certainly with you, there is no 100%
guarantee that a busy database will survive a sudden power outage.

Ian


Re: beta testing version

From
ncm@zembu.com (Nathan Myers)
Date:
On Thu, Nov 30, 2000 at 11:06:31PM -0800, Vadim Mikheev wrote:
> > As for replaying logs against a restored snapshot dump... AIUI, a 
> > dump records tuples by OID, but the WAL refers to TIDs.  Therefore, 
> > the WAL won't work as a re-do log to recover your transactions 
> > because the TIDs of the restored tables are all different.   
> 
> True for current way of backing up - ie saving data in "external"
> (sql) format. But there is another way - saving data files in their
> natural (binary) format. WAL records may be applied to
> such dump, right?

But (AIUI) you can only safely/usefully copy those files when the 
database is shut down.

Many people hope to run PostgreSQL 24x7x365.  With vacuuming, you 
might just as well shut down afterward; but when that goes away 
(in 7.2?), when will you get the chance to take your backups?  
Clearly we need either another form of snapshot backup that can 
be taken with the database running, and compatible with the 
current WAL (or some variation on it); or, we need another kind 
of log, in addition to the WAL.

> > To get replaying we need an "update log", something that might be
> > in 7.2 if somebody does a lot of work.
> 
> What did you mean by "update log"?
> Are you sure that WAL is not "update log" ? -:)

No, I'm not sure.  I think it's possible that a new backup utility 
could be written to make a hot backup which could be restored and 
then replayed using the current WAL format.  It might be easier to
add another log which could be replayed against the existing form
of backups.  That last is what I called the "update log".

The point is, WAL now does one job superbly: maintain a consistent
on-disk database image.  Asking it to do something else, such as 
supporting hot BAR, could interfere with it doing its main job.  
Of course, only the person who implements hot BAR can say.

Nathan Myers
ncm@zembu.com


Re: beta testing version

From
Philip Warner
Date:
At 00:55 1/12/00 -0800, Nathan Myers wrote:
>On Thu, Nov 30, 2000 at 11:06:31PM -0800, Vadim Mikheev wrote:
>> > As for replaying logs against a restored snapshot dump... AIUI, a 
>> > dump records tuples by OID, but the WAL refers to TIDs.  Therefore, 
>> > the WAL won't work as a re-do log to recover your transactions 
>> > because the TIDs of the restored tables are all different.   
>> 
>> True for current way of backing up - ie saving data in "external"
>> (sql) format. But there is another way - saving data files in their
>> natural (binary) format. WAL records may be applied to
>> such dump, right?
>
>But (AIUI) you can only safely/usefully copy those files when the 
>database is shut down.
>

This is not true; the way Vadim has implemented WAL is to write a series of
files of fixed size. When all transactions that have records in one file
have completed, that file is (currently) deleted. When BAR is going, the
files will be archived.

The only circumstance in which this strategy will fail is if there are a
large number of intensive long-standing single transactions - which is
unlikely (not to mention bad practice).

As a result of this, BAR will just need to take a snapshot of the database
and apply the logs (basically like a very extended recovery process).
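Put differently, the hook BAR needs is small: where the current code deletes
a finished segment, copy it somewhere safe first, then the snapshot-plus-replay
scheme above works.  A hand-waving sketch (the paths, the shell-out to cp, and
the segment name are placeholders, not actual 7.1 code):

    #include <stdio.h>
    #include <stdlib.h>

    /* Called once no live transaction has records in 'segpath':
     * either recycle the segment (what 7.1 does today) or archive
     * it first (what a BAR facility would do). */
    static void retire_segment(const char *segpath, int archiving)
    {
        if (archiving) {
            char cmd[1024];
            snprintf(cmd, sizeof cmd,
                     "cp %s /var/lib/pgsql/wal_archive/", segpath);
            if (system(cmd) != 0) {
                fprintf(stderr, "archiving %s failed; keeping it\n", segpath);
                return;                 /* never lose an unarchived segment */
            }
        }
        if (remove(segpath) != 0)
            perror(segpath);
    }

    int main(void)
    {
        retire_segment("pg_xlog/0000000000000003", 1);
        return 0;
    }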

You have raised some interesting issues regarding write-order etc. Can we
assume that when fsync *returns*, all records are written - though not
necessarily in the order that the IO's were executed?





----------------------------------------------------------------
Philip Warner                    |     __---_____
Albatross Consulting Pty. Ltd.   |----/       -  \
(A.B.N. 75 008 659 498)          |          /(@)   ______---_
Tel: (+61) 0500 83 82 81         |                 _________  \
Fax: (+61) 0500 83 82 82         |                 ___________ |
Http://www.rhyme.com.au          |                /           \|
                                 |    --________--
PGP key available upon request,  |  /
and from pgp5.ai.mit.edu:11371   |/


Re: beta testing version

From
Don Baccus
Date:
At 12:55 AM 12/1/00 -0800, Nathan Myers wrote:

>Many people hope to run PostgreSQL 24x7x365.  With vacuuming, you 
>might just as well shut down afterward; but when that goes away 
>(in 7.2?), when will you get the chance to take your backups?  
>Clearly we need either another form of snapshot backup that can 
>be taken with the database running, and compatible with the 
>current WAL (or some variation on it); or, we need another kind 
>of log, in addition to the WAL.

Vadim's not ignorant of such matters, when he says "make a copy
of the files" he's not talking about using tar on a running
database.  BAR tools are needed, as Vadim has pointed out here in
the past.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service
  and other goodies at http://donb.photo.net.
 


Re: beta testing version

From
Don Baccus
Date:
At 11:06 PM 11/30/00 -0800, Vadim Mikheev wrote:
>> As for replaying logs against a restored snapshot dump... AIUI, a 
>> dump records tuples by OID, but the WAL refers to TIDs.  Therefore, 
>> the WAL won't work as a re-do log to recover your transactions 
>> because the TIDs of the restored tables are all different.   
>
>True for current way of backing up - ie saving data in "external"
>(sql) format. But there is another way - saving data files in their
>natural (binary) format. WAL records may be applied to
>such dump, right?

Right.  That's what's missing in PG 7.1, the existence of tools to
make such backups.  

Probably the best answer to the "what does WAL get us, if it doesn't
get us full recoverability" questions is to simply say "it's a prerequisite
to getting full recoverability, PG 7.1 sets the foundation and later
work will get us there".



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service
  and other goodies at http://donb.photo.net.
 


Re: beta testing version

From
Don Baccus
Date:
At 12:30 AM 12/1/00 -0800, Ian Lance Taylor wrote:
>For example, I would hope that EMC
>disk systems handle power loss gracefully.

They must, their marketing literature says so :)

>  But if you buy ordinary
>off the shelf PC hardware, you really do need to arrange for a UPS,
>and some sort of automatic shutdown if the UPS is running low.

Which is what disk subsystems like those from EMC do for you.  They've
got built-in battery backup that lets them guarantee (assuming the
hardware's working right) that in the case of a power outage, all blocks
the operating system thinks have been written will in actuality be written
before the disk subsystem powers itself down.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service
  and other goodies at http://donb.photo.net.
 


Re: beta testing version

From
"Vadim Mikheev"
Date:
> > > As for replaying logs against a restored snapshot dump... AIUI, a 
> > > dump records tuples by OID, but the WAL refers to TIDs.  Therefore, 
> > > the WAL won't work as a re-do log to recover your transactions 
> > > because the TIDs of the restored tables are all different.   
> > 
> > True for current way of backing up - ie saving data in "external"
> > (sql) format. But there is another way - saving data files in their
> > natural (binary) format. WAL records may be applied to
> > such dump, right?
> 
> But (AIUI) you can only safely/usefully copy those files when the 
> database is shut down.

No. You can read/save datafiles at any time. But block reads must be
"atomic" - no one should be able to change any part of a block while
we read it. Cp & tar are probably not suitable for this, but an internal
BACKUP command could do this.

Restoring from such a backup will be like recovering after pg_ctl -m i stop:
all data blocks are consistent and WAL records may be applied to them.
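Something like the sketch below, then, where lock_block()/unlock_block()
stand in for whatever buffer-manager interlock a real BACKUP command would
take; the paths and the locking are placeholders, not backend code.

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    #define BLCKSZ 8192

    /* Placeholders: keep anyone from modifying a block while we copy it. */
    static void lock_block(int blkno)   { (void) blkno; }
    static void unlock_block(int blkno) { (void) blkno; }

    static int backup_relation(const char *src, const char *dst)
    {
        char    buf[BLCKSZ];
        int     in  = open(src, O_RDONLY);
        int     out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        int     blkno = 0;
        ssize_t n;

        if (in < 0 || out < 0) { perror("open"); return -1; }
        for (;;) {
            lock_block(blkno);          /* block cannot change while held */
            n = read(in, buf, BLCKSZ);
            unlock_block(blkno);
            if (n <= 0)
                break;                  /* EOF or read error */
            if (write(out, buf, n) != n) { perror("write"); n = -1; break; }
            blkno++;
        }
        fsync(out);
        close(in);
        close(out);
        return (n < 0) ? -1 : 0;
    }

    int main(void)
    {
        return backup_relation("base/mydb/16384", "backup/16384") ? 1 : 0;
    }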

> Many people hope to run PostgreSQL 24x7x365.  With vacuuming, you 
> might just as well shut down afterward; but when that goes away 
> (in 7.2?), when will you get the chance to take your backups?  

The ability to shut down 7.2 will be preserved -:))
But it's not required for backup.

> > > To get replaying we need an "update log", something that might be
> > > in 7.2 if somebody does a lot of work.
> > 
> > What did you mean by "update log"?
> > Are you sure that WAL is not "update log" ? -:)
> 
> No, I'm not sure.  I think it's possible that a new backup utility 
> could be written to make a hot backup which could be restored and 
> then replayed using the current WAL format.  It might be easier to
> add another log which could be replayed against the existing form
> of backups.  That last is what I called the "update log".

Consistent read of data blocks is easier to implement, sure.

> The point is, WAL now does one job superbly: maintain a consistent
> on-disk database image.  Asking it to do something else, such as 
> supporting hot BAR, could interfere with it doing its main job.  
> Of course, only the person who implements hot BAR can say.

There will be no interference because BAR will not ask WAL to do
anything beyond what it does right now - redoing changes.

Vadim




Re: beta testing version

From
ncm@zembu.com (Nathan Myers)
Date:
On Fri, Dec 01, 2000 at 06:39:57AM -0800, Don Baccus wrote:
> 
> Probably the best answer to the "what does WAL get us, if it doesn't
> get us full recoverability" questions is to simply say "it's a 
> prerequisite to getting full recoverability, PG 7.1 sets the foundation 
> and later work will get us there".

Not to quibble, but for most of us, the answer to Don's question is:
"It gives a ~20x speedup over 7.0."  That's pretty valuable to some of us.
If it turns out to be useful for other stuff, that's gravy.

Nathan Myers
ncm@zembu.com


Re: beta testing version

From
Don Baccus
Date:
At 11:02 AM 12/1/00 -0800, Nathan Myers wrote:
>On Fri, Dec 01, 2000 at 06:39:57AM -0800, Don Baccus wrote:
>> 
>> Probably the best answer to the "what does WAL get us, if it doesn't
>> get us full recoverability" questions is to simply say "it's a 
>> prerequisite to getting full recoverability, PG 7.1 sets the foundation 
>> and later work will get us there".
>
>Not to quibble, but for most of us, the answer to Don's question is:
>"It gives a ~20x speedup over 7.0."  That's pretty valuable to some of us.
>If it turns out to be useful for other stuff, that's gravy.

Oh, but given that power failures eat disks anyway, you can just run PG 7.0
with -F and be just as fast as PG 7.1, eh?  With no theoretical loss in
safety?  Where's your faith in all that doom and gloom you've been 
spreading? :) :)

You're right, of course, we'll get roughly -F performance while maintaining
a much more comfortable level of risk than you get with -F.



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service
  and other goodies at http://donb.photo.net.
 


WAL information

From
"Mitch Vincent"
Date:
Ok, this has piqued my interest in learning exactly what WAL is and what it
does... I don't see any in-depth explanation of WAL on the postgresql.org
site; can someone point me to some documentation? (if any exists, that is).

Thanks!

-Mitch

----- Original Message -----
From: "Nathan Myers" <ncm@zembu.com>
To: <pgsql-hackers@postgresql.org>
Sent: Friday, December 01, 2000 11:02 AM
Subject: Re: [HACKERS] beta testing version


> On Fri, Dec 01, 2000 at 06:39:57AM -0800, Don Baccus wrote:
> >
> > Probably the best answer to the "what does WAL get us, if it doesn't
> > get us full recoverability" questions is to simply say "it's a
> > prerequisite to getting full recoverability, PG 7.1 sets the foundation
> > and later work will get us there".
>
> Not to quibble, but for most of us, the answer to Don's question is:
> "It gives a ~20x speedup over 7.0."  That's pretty valuable to some of us.
> If it turns out to be useful for other stuff, that's gravy.
>
> Nathan Myers
> ncm@zembu.com
>



Re: beta testing version

From
ncm@zembu.com (Nathan Myers)
Date:
On Fri, Dec 01, 2000 at 09:13:28PM +1100, Philip Warner wrote:
> 
> You have raised some interesting issues regarding write-order etc. Can we
> assume that when fsync *returns*, all records are written - though not
> necessarily in the order that the IO's were executed?

Not with ordinary disks.  With a battery-backed disk server, yes.

Nathan Myers
ncm@zembu.com



Re: beta testing version

From
ncm@zembu.com (Nathan Myers)
Date:
On Fri, Dec 01, 2000 at 08:10:40AM -0800, Vadim Mikheev wrote:
> 
> > ... a new backup utility 
> > could be written to make a hot backup which could be restored and 
> > then replayed using the current WAL format.  It might be easier to
> > add another log which could be replayed against the existing form
> > of backups.  That last is what I called the "update log".
> 
> Consistent read of data blocks is easier to implement, sure.
> 
> > The point is, WAL now does one job superbly: maintain a consistent
> > on-disk database image.  Asking it to do something else, such as 
> > supporting hot BAR, could interfere with it doing its main job.  
> > Of course, only the person who implements hot BAR can say.
> 
> There will be no interference because BAR will not ask WAL to do
> anything beyond what it does right now - redoing changes.

The interference I meant is that the current WAL file format is designed 
for its current job.  For BAR, you would be better-served by a more 
compact format, so you need not archive your logs so frequently.  
(The size of the WAL doesn't matter much because you can rotate them 
very quickly.)  A more compact format is also better as a basis for 
replication, to minimize network traffic.  To compress the WAL would 
hurt performance -- but adding performance was the point of the WAL.

A log encoded at a much higher semantic level could be much more 
compact, but wouldn't be useful as a WAL because it describes 
differences from a snapshot backup, not from the current table 
file contents.
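To make the contrast concrete: a physical redo record says "these bytes on
this page", while a logical update record says "this row now has these
values" and so can be applied to a restored dump whose TIDs are all
different.  Both structs below are invented purely for illustration:

    #include <stdint.h>

    /* Physical-level record: enough to repair a page image, but tied
     * to exact block numbers (TIDs) in one particular set of files.  */
    typedef struct {
        uint32_t relfilenode;   /* which table file            */
        uint32_t blockno;       /* which 8kB page              */
        uint16_t offset;        /* byte offset within the page */
        uint16_t len;           /* length of the changed bytes */
        /* ... followed by 'len' raw bytes of page data ...     */
    } PhysicalRedoRec;

    /* Logical-level record: describes the change in terms of the row,
     * so it is compact and independent of physical block layout.     */
    typedef struct {
        uint32_t reloid;        /* which table (by OID)        */
        uint32_t keylen;        /* primary-key image length    */
        uint32_t newlen;        /* new column values length    */
        /* ... followed by key bytes, then new-value bytes ...  */
    } LogicalUpdateRec;

    int main(void) { return 0; }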

Thus, I'm not saying that you can't implement both WAL and hot BAR
using the same log; rather, it's just not _obviously_ the best way to 
do it.  

Nathan Myers
ncm@zembu.com


Re: beta testing version

From
"Magnus Naeslund\(f\)"
Date:
From: "Nathan Myers" <ncm@zembu.com>
> On Thu, Nov 30, 2000 at 07:02:01PM -0400, The Hermit Hacker wrote:
> >
[snip]
> The logging in 7.1 protects transactions against many sources of
> database crash, but not necessarily against OS crash, and certainly
> not against power failure.  (You might get lucky, or you might just
> think you were lucky.)  This is the same as for most databases; an
> embedded database that talks directly to the hardware might be able
> to do better.
>

If PG had a kind of tree-based logging filesystem that it handled itself,
wouldn't that be almost perfectly safe? I mean that you might lose some data
in a transaction, but the client never gets an OK anyway...
Like a combination of raw block I/O and a tux2-like fs.
Doesn't Oracle do its own block I/O?

Magnus

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Programmer/Networker [|] Magnus Naeslund
PGP Key: http://www.genline.nu/mag_pgp.txt
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-






Re: beta testing version

From
ncm@zembu.com (Nathan Myers)
Date:
On Fri, Dec 01, 2000 at 12:00:12AM -0400, The Hermit Hacker wrote:
> On Thu, 30 Nov 2000, Nathan Myers wrote:
> > On Thu, Nov 30, 2000 at 07:02:01PM -0400, The Hermit Hacker wrote:
> > > v7.1 should improve crash recovery ...
> > > ... with the WAL stuff that Vadim is producing, you'll be able to
> > > recover up until the point that the power cable was pulled out of 
> > > the wall.
> > 
> > Please do not propagate falsehoods like the above.  It creates
> > unsatisfiable expectations, and leads people to fail to take
> > proper precautions and recovery procedures.  
> > 
> > After a power outage on an active database, you may have corruption
> > at low levels of the system, and unless you have enormous redundancy
> > (and actually use it to verify everything) the corruption may go 
> > undetected and result in (subtly) wrong answers at any future time.
> > 
> > The logging in 7.1 protects transactions against many sources of 
> > database crash, but not necessarily against OS crash, and certainly
> > not against power failure.  (You might get lucky, or you might just 
> > think you were lucky.)  This is the same as for most databases; an
> > embedded database that talks directly to the hardware might be able
> > to do better.  
> 
> We're talking about transaction logging here ... nothing gets written
> to it until completed ... if I take a "known to be clean" backup from
> the night before, restore that and then run through the transaction
> logs, my data should be clean, unless my tape itself is corrupt. If
> the power goes off half way through a write to the log, then that
> transaction wouldn't be marked as completed and won't roll into the
> restore ...

Sorry, wrong.  First, the only way that your backups could have any
relationship with the transaction logs is if they are copies of the
raw table files with the database shut down, rather than the normal 
"snapshot" backup.  

Second, the transaction log is not, as has been noted far too frequently
for Vince's comfort, really written atomically.  The OS has promised
to write it atomically, and given the opportunity, it will.  If you pull 
the plug, all promises are broken.

> if a disk goes corrupt, I'd expect that the redo log would possibly
> have a problem with corruption .. but if I pull the plug, unless I've
> somehow damaged the disk, I would expect my redo log to be clean
> *and*, unless Vadim totally messed something up, if there is any
> corruption in the redo log, I'd expect that restoring from it would
> generate some red flags ...

You have great expectations, but nobody has done the work to satisfy
them, so when you pull the plug, I'd expect that you will be left 
in the dark, alone and helpless.

Vadim has done an excellent job on what he set out to do: optimize
transaction processing.  Designing and implementing a factor-of-twenty 
speed improvement on a professional-quality database engine demanded
great effort and expertise.  To complain that he hasn't also done 
a lot of other stuff would be petty.

Nathan Myers
ncm@zembu.com



Re: beta testing version

From
Tom Lane
Date:
ncm@zembu.com (Nathan Myers) writes:
> On Fri, Dec 01, 2000 at 09:13:28PM +1100, Philip Warner wrote:
>> You have raised some interesting issues regarding write-order etc. Can we
>> assume that when fsync *returns*, all records are written - though not
>> necessarily in the order that the IO's were executed?

> Not with ordinary disks.  With a battery-backed disk server, yes.

I think the real point of this discussion is that there's no such thing
as an ironclad guarantee.  That's why people make backups.

All we can do is the best we can ;-).  In that light, I think it's
reasonable for Postgres to proceed on the assumption that fsync does
what it claims to do, ie, all blocks are written when it returns.
We can't realistically expect to persuade a disk controller that
reorders writes to stop doing so.  We can, however, expect that we've
minimized the probability of failures induced by anything other than
disk hardware failure or power failure.
        regards, tom lane


Re: beta testing version

From
Don Baccus
Date:
At 03:35 PM 11/30/00 -0800, Nathan Myers wrote:
>On Thu, Nov 30, 2000 at 07:02:01PM -0400, The Hermit Hacker wrote:
>> 
>> v7.1 should improve crash recovery ...
>> ... with the WAL stuff that Vadim is producing, you'll be able to
>> recover up until the point that the power cable was pulled out of 
>> the wall.
>
>Please do not propagate falsehoods like the above.  It creates
>unsatisfiable expectations, and leads people to fail to take
>proper precautions and recovery procedures.  

Yeah, I posted similar stuff to the PHPbuilder forum in regard to
PG.

>The logging in 7.1 protects transactions against many sources of 
>database crash, but not necessarily against OS crash, and certainly
>not against power failure.  (You might get lucky, or you might just 
>think you were lucky.)  This is the same as for most databases; an
>embedded database that talks directly to the hardware might be able
>to do better.  

Let's put it this way ... Oracle, a transaction-safe DB with REDO
logging, has for a very long time implemented disk mirroring.  Now,
why would they do that if you could pull the plug on the processor
and depend on REDO logging to save you?

And even then you're expected to provide adequate power backup to
enable clean shutdown.

The real safety you get is that your battery sez "we need to shut
down!" but has enough power to let you.  Transactions in progress
aren't logged, but everything else can tank cleanly, and your DB is
in a consistent state.  

Mirroring protects you against (some) disk drive failures (but not
those that are transparent to the RAID controller/driver - if your
drive writes crap to the primary side of the mirror and no errors
are returned to the hardware/driver, the other side of the mirror
can faithfully reproduce the garbage on the mirror!)

But since drives contain bearings and such that are much more likely
to fail than electronics (good electronics and good designs, at least),
mechanical failure's more likely and will be known to whatever is driving
the drive.  And you're OK then...



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service
  and other goodies at http://donb.photo.net.
 


Re: beta testing version

From
Vince Vielhaber
Date:
On Thu, 30 Nov 2000, Nathan Myers wrote:

> Second, the transaction log is not, as has been noted far too frequently
> for Vince's comfort, really written atomically.  The OS has promised
> to write it atomically, and given the opportunity, it will.  If you pull
> the plug, all promises are broken.

Say what?

Vince.
-- 
==========================================================================
Vince Vielhaber -- KA8CSH    email: vev@michvhf.com    http://www.pop4.net
 128K ISDN from $22.00/mo - 56K Dialup from $16.00/mo at Pop4 Networking
       Online Campground Directory    http://www.camping-usa.com
      Online Giftshop Superstore      http://www.cloudninegifts.com
 
==========================================================================





Re: beta testing version

From
Don Baccus
Date:
At 02:47 PM 12/1/00 -0500, Tom Lane wrote:

>All we can do is the best we can ;-).  In that light, I think it's
>reasonable for Postgres to proceed on the assumption that fsync does
>what it claims to do, ie, all blocks are written when it returns.
>We can't realistically expect to persuade a disk controller that
>reorders writes to stop doing so.  We can, however, expect that we've
>minimized the probability of failures induced by anything other than
>disk hardware failure or power failure.

Right.  This is very much the guarantee that RAID (non-zero) makes,
except "other than disk hardware failure" is replaced by "other than
the failure of two drives".  RAID gives you that (a very, very substantial
boost, which is why it is so popular for DB servers).  It doesn't give
you power failure assurance, for much the same reason that PG (or Oracle,
etc) can't.

If transaction processing alone could give you protection against a 
single disk hardware failure, Oracle wouldn't have bothered implementing
mirroring back before software (and even reasonable hardware)
RAID was available.

Likewise, if mirroring + transaction processing could protect against
disks hosing themselves in power failure situations, Oracle wouldn't
suggest that enterprise-level customers invest in external disk
subsystems with battery backup sufficient to guarantee everything
the db server believes has been written really is written.

Of course, Oracle license fees are high enough that proper hardware
support tends to look cheap in comparison...

Vadim's WAL code is excellent, and the fact that we run with, in essence,
-F performance and less write activity to the disk both increases
performance and tends to lessen the probability that the disk will
actually be writing a block when the power goes off.  The dice aren't
quite so loaded against the server with this lowered disk activity...



- Don Baccus, Portland OR <dhogaza@pacifier.com>
  Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Service
  and other goodies at http://donb.photo.net.