Thread: Re: beta testing version

Re: beta testing version

From: mlw
Peter Eisentraut wrote:
> 
> mlw writes:
> 
> > There are hundreds (thousands?) of people that have contributed to the
> > development of Postgres, either directly with code, or beta testing,
> > with the assumption that they are benefiting a community. Many would
> > probably not have done so if they had suspected that what they do is
> > used in a product that excludes them.
> 
> With the BSD license it has always been clear that this would be possible,
> and for as long as I've been around the core/active developers have
> frequently reiterated that this is a desirable aspect and in fact
> encouraged.  If you don't like that, then you should have read the license
> before using the product.
> 
> > I have said before, open source is a social contract, not a business
> > model.
> 
> Well, you're free to take the PostgreSQL source and start your own "social
> contract" project; but we don't do that around here.

And you don't feel that this is a misappropriation of a public trust? I
feel shame for you.

-- 
http://www.mohawksoft.com


Re: beta testing version

From: "xuyifeng"
Hi,
   How long has PG 7.1 been in beta testing? Can it be released before
Christmas day? Will PG 7.1 recover the database from a system crash?

Thanks,
XuYifeng


Re: beta testing version

From: Michael Fork
Judging by the information below, taken *directly* from PostgreSQL, Inc.
website, it appears that they will be releasing all code into the main
source code branch -- with the exception of "Advanced Replication and
Distributed Information capabilities" (to which capabilities they are
referring is not made clear) which may remain proprietary for up to 24
months "in order to assist us in recovering development costs and continue
to provide funding for our other Open Source contributions."

I have interpreted this to mean that basic replication (server -> server,
server -> client, possibly more)  will be available shortly for Postgres
(with the release of 7.1?) and that those more advanced features will
follow behind.  This is one of the last features that was missing from
Postgres (along with recordset returning functions and clusters, among
others) that was holding it back from the enterprise market -- and I do
not blame PostgreSQL, Inc. one bit for withholding some of the more
advanced features to recoup their development costs -- it was *their time*
and *their money* they spent developing the *product* and it must be
recouped for projects like this to make sense in the future (who knows,
maybe next they will implement RS returning SP's or clusters, projects
that are funded with their profit off the advanced replication and
distributed information capabilities that they *may* withhold -- would
people still be whining then?)

Michael Fork - CCNA - MCP - A+ 
Network Support - Toledo Internet Access - Toledo Ohio

(http://www.pgsql.com/press/PR_5.html)
"At the moment we are limiting our test groups to our existing Platinum
Partners and those clients whose requirements include these
features." advises Jeff MacDonald, VP of Support Services. "We expect to
have the source code tested and ready to contribute to the open source
community before the middle of October. Until that time we are considering
requests from a number of development companies and venture capital groups
to join us in this process."

Davidson explains, "These initial Replication functions are important to
almost every commercial user of PostgreSQL. While we've fully funded all
of this development ourselves, we will be immediately donating these
capabilities to the open source PostgreSQL Global Development Project as
part of our ongoing commitment to the PostgreSQL community." 

http://www.erserver.com/
eRServer development is currently concentrating on core, universal
functions that will enable individuals and IT professionals to implement
PostgreSQL ORDBMS solutions for mission critical datawarehousing,
datamining, and eCommerce requirements. These initial developments will be
published under the PostgreSQL Open Source license, and made available
through our sites, Certified Platinum Partners, and others in PostgreSQL
community.

Advanced Replication and Distributed Information capabilities are also
under development to meet specific business and competitive requirements
for both PostgreSQL, Inc. and clients. Several of these enhanced
PostgreSQL, Inc. developments may remain proprietary for up to 24 months,
with availability limited to clients and partners, in order to assist us
in recovering development costs and continue to provide funding for our
other Open Source contributions. 

On Sun, 3 Dec 2000, Hannu Krosing wrote:

> The Hermit Hacker wrote:
> IIRC, this thread woke up on someone complaining about PostgreSQL, Inc.
> promising to release some code for replication in mid-October, and asking
> for confirmation that this is just a schedule slip and that the project
> is still going on and going to be released as open source.
> 
> What seems to be the answer is: "NO, we will keep the replication code
> proprietary".
> 
> I have not seen this answer myself, but I've got this impression from
> the contents of the whole discussion.
> 
> Do you know if this is the case ?
> 
> -----------
> Hannu
> 


RE: beta testing version

From: "Mikheev, Vadim"
> As far as I know (and have tested in excess) Informix IDS 
> does survive any power loss without leaving the db in a
> corrupted state. The basic technology is that it only relies
> on writes to one "file" (raw device in that case), the txlog,
> which is directly written. All writes to the txlog are basically
> appends to that log. Meaning that all writes are sync writes to
> the currently active (== last) page. All other IO is not a problem,
> because a backup image "physical log" is kept for each page 
> that needs to be written. During fast recovery the content of the
> physical log is restored to the originating pages (thus all pending
> IO is undone) before rollforward is started.

Sounds great! We can follow this way: when the first update to a page
after the last checkpoint is logged, the XLOG code can log the entire
page (creating a backup "physical log") instead of an AM-specific update
record. During after-crash recovery such pages will be redone first,
ensuring page consistency for further redo ops. This means a bigger log,
of course.
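
A minimal sketch of the idea in C (all names here are made up, not the
real XLOG interfaces):

    #include <stdint.h>
    #include <stddef.h>

    #define BLCKSZ 8192

    typedef struct {
        uint64_t last_backup_lsn;   /* LSN of last full-page image */
        char     data[BLCKSZ];
    } Page;

    static uint64_t checkpoint_lsn; /* LSN of the last checkpoint */

    /* stand-in for the real WAL-insert routine */
    extern void xlog_insert(const void *rec, size_t len);

    void
    log_update(Page *pg, const void *rec, size_t reclen, uint64_t cur_lsn)
    {
        if (pg->last_backup_lsn < checkpoint_lsn) {
            /* First change since the checkpoint: back up the whole page,
             * so redo can restore a consistent image even if the on-disk
             * page was torn by a partial write. */
            xlog_insert(pg->data, BLCKSZ);
            pg->last_backup_lsn = cur_lsn;
        } else {
            /* Page already backed up this cycle: the small AM-specific
             * record is enough. */
            xlog_insert(rec, reclen);
        }
    }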

Initdb will not be required for these code changes, so they can be
implemented in any 7.1.X, X >= 1.

Thanks, Andreas!

Vadim


RE: beta testing version

From: "Mikheev, Vadim"
> > I totally missed your point here. How is closing the source of
> > ERserver related to closing the code of the PostgreSQL DB server?
> > Let me make things clear:
> >
> > 1. ERserver isn't based on WAL. It will work with any version >= 6.5
> >
> > 2. WAL was partially sponsored by my employer, Sectorbase.com,
> > not by PG, Inc.
> 
> Has somebody thought about putting PG under the GPL license
> instead of the BSD?
> PG Inc would still be able to make their money giving support
> (just like IBM, HP and Compaq are doing their share with Linux),
> without being able to close the code.

ERserver is an *external* application that changes *nothing* in the
PostgreSQL code. So, no matter what license the server code is under,
any company will be able to close the code of any privately developed
*external* application.
And I don't see what's wrong with this, do you?

Vadim


Re: beta testing version

From: Alfred Perlstein
> > > I totally missed your point here. How is closing the source of
> > > ERserver related to closing the code of the PostgreSQL DB server?
> > > Let me make things clear:
> > >
> > > 1. ERserver isn't based on WAL. It will work with any version >= 6.5
> > >
> > > 2. WAL was partially sponsored by my employer, Sectorbase.com,
> > > not by PG, Inc.
> > 
> > Has somebody thought about putting PG under the GPL license
> > instead of the BSD?
> > PG Inc would still be able to make their money giving support
> > (just like IBM, HP and Compaq are doing their share with Linux),
> > without being able to close the code.

This gets brought up every couple of months. I don't see the point
in denying any of the current PostgreSQL developers the chance
to make some money selling a non-freeware version of PostgreSQL.

We can also look at it another way: say ERserver was meant to be
closed source. If the code it was derived from had been GPL'd,
then that chance was gone before it even happened.  Hence no
reason to develop it.

*poof* no ERserver.

-- 
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


Re: beta testing version

From: ncm@zembu.com (Nathan Myers)
On Tue, Dec 05, 2000 at 10:43:03AM -0800, Mikheev, Vadim wrote:
> > As far as I know (and have tested in excess) Informix IDS 
> > does survive any power loss without leaving the db in a
> > corrupted state. The basic technology is that it only relies
> > on writes to one "file" (raw device in that case), the txlog,
> > which is directly written. All writes to the txlog are basically
> > appends to that log. Meaning that all writes are sync writes to
> > the currently active (== last) page. All other IO is not a problem,
> > because a backup image "physical log" is kept for each page 
> > that needs to be written. During fast recovery the content of the
> > physical log is restored to the originating pages (thus all pending
> > IO is undone) before rollforward is started.
> 
> Sounds great! We can follow this way: when the first update to a page
> after the last checkpoint is logged, the XLOG code can log the entire
> page (creating a backup "physical log") instead of an AM-specific update
> record. During after-crash recovery such pages will be redone first,
> ensuring page consistency for further redo ops. This means a bigger log,
> of course.

Be sure to include a CRC of each part of the block that you hope
to replay individually.

Nathan Myers
ncm@zembu.com


RE: beta testing version

From: "Mikheev, Vadim"
> > Sounds great! We can follow this way: when the first update to a page
> > after the last checkpoint is logged, the XLOG code can log the entire
> > page (creating a backup "physical log") instead of an AM-specific
> > update record. During after-crash recovery such pages will be redone
> > first, ensuring page consistency for further redo ops. This means a
> > bigger log, of course.
>  
> Be sure to include a CRC of each part of the block that you hope
> to replay individually.

Why should we do this? I'm not going to replay parts individually;
I'm going to write entire pages to the OS cache and then apply changes
to them. Recovery is considered successful once the server has ensured
that all applied changes are on disk. In the case of a crash during
recovery we'll replay the entire game.
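
In outline (hypothetical names, just to pin the protocol down):

    /* A crash during recovery is harmless: nothing is trusted until the
     * final fsync and the "recovery complete" mark, so after a second
     * crash we simply replay the whole game from the start. */
    extern void restore_full_page_images(void); /* backed-up pages first */
    extern void replay_am_records(void);        /* then AM-specific redo */
    extern void fsync_all_data_files(void);
    extern void mark_recovery_complete(void);

    void
    recover(void)
    {
        restore_full_page_images();
        replay_am_records();
        fsync_all_data_files();      /* make applied changes durable */
        mark_recovery_complete();    /* only now has recovery succeeded */
    }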

Vadim


RE: beta testing version

From: "Mikheev, Vadim"
> > > > Sounds great! We can follow this way: when the first update to a
> > > > page after the last checkpoint is logged, the XLOG code can log
> > > > the entire page (creating a backup "physical log") instead of an
> > > > AM-specific update record. During after-crash recovery such pages
> > > > will be redone first, ensuring page consistency for further redo
> > > > ops. This means a bigger log, of course.
> > >  
> > > Be sure to include a CRC of each part of the block that you hope
> > > to replay individually.
> > 
> > Why should we do this? I'm not going to replay parts individually;
> > I'm going to write entire pages to the OS cache and then apply
> > changes to them. Recovery is considered successful once the server
> > has ensured that all applied changes are on disk. In the case of a
> > crash during recovery we'll replay the entire game.
> 
> Yes, but there would need to be a way to verify the last page
> or record from the txlog when running on crap hardware. The point was
> that crap hardware writes our 8k pages in any order (e.g. 512 bytes
> from the end, then 512 bytes from the front ...), and does not
> even notice that it only wrote part of one such 512-byte block when
> reading it back after a crash. But I actually doubt that this is
> true for all but the most crappy hardware.

Oh, I didn't consider log consistency that time. Anyway, we need a CRC
for the entire log record, not for its 512-byte parts.

Well, I didn't worry about non-atomic 8K-block writes in the current WAL
implementation - we were never protected from this: a backend inserts a
tuple, but only the line pointers go to disk => the new lp points at some
garbage inside the un-updated page content. Yes, the transaction was not
committed, but who knows the content of this garbage and what we'll get
from a scan trying to read it. Same for index pages.

Can we come to an agreement about CRCs in log records? Probably it's
not too late to add them (initdb).

On seeing a bad CRC, the recovery procedure will assume that the current
record (and all others after it, if any) is garbage - i.e. it comes from
an interrupted disk write - and may be ignored (a backend writes data
pages only after the changes are logged - if the changes weren't
successfully logged, then the on-disk image of the data pages was not
updated and we are not interested in those log records).
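
Concretely, the replay loop would stop at the first mismatching record.
A sketch, with a hypothetical record layout and helpers:

    #include <stdint.h>
    #include <stddef.h>

    typedef struct {
        uint32_t len;   /* total record length, header included */
        uint32_t crc;   /* CRC of the record body */
        char     body[];
    } LogRec;

    /* supplied elsewhere (hypothetical helpers) */
    extern LogRec  *read_next_record(void);
    extern uint32_t compute_crc(const void *buf, size_t len);
    extern void     redo(const LogRec *rec);

    void
    replay_log(void)
    {
        LogRec *rec;

        while ((rec = read_next_record()) != NULL) {
            if (compute_crc(rec->body, rec->len - offsetof(LogRec, body))
                    != rec->crc)
                break;  /* torn write: ignore this and all later records */
            redo(rec);
        }
    }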

This may be implemented very fast (if someone points me where
I can find CRC func). And I could implement "physical log"
till next monday.

Comments?

Vadim


Re: beta testing version

From: Tom Lane
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
> This may be implemented very fast (if someone points me where
> I can find CRC func).

Lifted from the PNG spec (RFC 2083):


15. Appendix: Sample CRC Code

  The following sample code represents a practical implementation of
  the CRC (Cyclic Redundancy Check) employed in PNG chunks. (See also
  ISO 3309 [ISO-3309] or ITU-T V.42 [ITU-V42] for a formal
  specification.)

     /* Make the table for a fast CRC. */
     void make_crc_table(void)
     {
       unsigned long c;
       int n, k;

       for (n = 0; n < 256; n++) {
         c = (unsigned long) n;
         for (k = 0; k < 8; k++) {
           if (c & 1)
             c = 0xedb88320L ^ (c >> 1);
           else
             c = c >> 1;
         }
         crc_table[n] = c;
       }
       crc_table_computed = 1;
     }

     /* Update a running CRC with the bytes buf[0..len-1]--the CRC
        should be initialized to all 1's, and the transmitted value
        is the 1's complement of the final running CRC (see the
        crc() routine below). */

     unsigned long update_crc(unsigned long crc, unsigned char *buf,
                              int len)
     {
       unsigned long c = crc;
       int n;

       if (!crc_table_computed)
         make_crc_table();
       for (n = 0; n < len; n++) {
         c = crc_table[(c ^ buf[n]) & 0xff] ^ (c >> 8);
       }
       return c;
     }

     /* Return the CRC of the bytes buf[0..len-1]. */
     unsigned long crc(unsigned char *buf, int len)
     {
       return update_crc(0xffffffffL, buf, len) ^ 0xffffffffL;
     }
 

        regards, tom lane


Re: beta testing version

From: Tom Lane
> Lifted from the PNG spec (RFC 2083):

Drat, I dropped the table declarations:
     /* Table of CRCs of all 8-bit messages. */
     unsigned long crc_table[256];

     /* Flag: has the table been computed? Initially false. */
     int crc_table_computed = 0;
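
A quick way to sanity-check it (my example, not from the RFC), assuming
the declarations above and the routines from the previous message are in
the same file:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        unsigned char rec[] = "some log record bytes";

        /* prints the standard PNG/zlib CRC-32 of the buffer */
        printf("crc = %08lx\n",
               crc(rec, (int) strlen((const char *) rec)));
        return 0;
    }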

        regards, tom lane


CRC was: Re: beta testing version

From: "Horst Herb"
> This may be implemented very fast (if someone points me where
> I can find CRC func). And I could implement "physical log"
> till next monday.

I have been experimenting with CRCs for the past 6 months in our database
for internal logging purposes. I downloaded a lot of hash libraries, tried
different algorithms, and implemented a few myself. Which algorithm do you
want? Have a look at the openssl libraries (www.openssl.org) for a start -
if you don't find what you want, let me know.

As the logging might include large data blocks, especially now that we can
TOAST our data, I would strongly suggest using strong hashes like RIPEMD or
MD5 instead of CRC-32 and the like. Sure, it takes more time to calculate
and more space on the hard disk, but then: a database without data
integrity (and means of _proving_ integrity) is pretty worthless.

Horst



pre-beta is slow

From: "xuyifeng"
Recently I downloaded a pre-beta PostgreSQL, and I found that insert and
update speed is slower than 7.0.3. Even if I turn off the sync flag, it
is still slower than 7.0. Why? How can I make it faster?

Regards,
XuYifeng



Re: CRC was: Re: beta testing version

From: Hannu Krosing
Horst Herb wrote:
> 
> > This may be implemented very fast (if someone points me where
> > I can find CRC func). And I could implement "physical log"
> > till next monday.
> 
> I have been experimenting with CRCs for the past 6 months in our database
> for internal logging purposes. I downloaded a lot of hash libraries, tried
> different algorithms, and implemented a few myself. Which algorithm do you
> want? Have a look at the openssl libraries (www.openssl.org) for a start -
> if you don't find what you want, let me know.
> 
> As the logging might include large data blocks, especially now that we can
> TOAST our data, I would strongly suggest using strong hashes like RIPEMD
> or MD5 instead of CRC-32 and the like. Sure, it takes more time to
> calculate and more space on the hard disk, but then: a database without
> data integrity (and means of _proving_ integrity) is pretty worthless.

The choice of hash algorithm could be made a compile-time switch quite
easily, I guess.
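
Something along these lines (hypothetical names and helper functions,
just to illustrate):

    #include <stdint.h>
    #include <stddef.h>

    /* checksum implementations, supplied elsewhere */
    extern uint32_t crc32_buf(const void *buf, size_t len);
    extern uint32_t md5_trunc32(const void *buf, size_t len);

    /* pick the log-record checksum at compile time */
    #ifdef USE_MD5_CHECKSUM
    #define LOGREC_CHECKSUM(buf, len) md5_trunc32((buf), (len))
    #else
    #define LOGREC_CHECKSUM(buf, len) crc32_buf((buf), (len))
    #endif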

---------
Hannu


Re: CRC was: Re: beta testing version

From: ncm@zembu.com (Nathan Myers)
On Thu, Dec 07, 2000 at 06:40:49PM +1100, Horst Herb wrote:
> > This may be implemented very fast (if someone points me where
> > I can find CRC func). And I could implement "physical log"
> > till next monday.
> 
> As the logging might include large data blocks, especially now that
> we can TOAST our data, I would strongly suggest using strong hashes
> like RIPEMD or MD5 instead of CRC-32 and the like.

Cryptographically-secure hashes are unnecessarily expensive to compute.
A simple 64-bit CRC would be of equal value, at much less expense.
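
For reference, a bitwise 64-bit CRC is only a few lines (my sketch,
using the reflected ECMA-182 polynomial; a table-driven version is
much faster):

    #include <stdint.h>
    #include <stddef.h>

    /* CRC-64 over buf[0..len-1]: initialize to all 1's and complement
     * the result, same convention as the PNG CRC-32. */
    uint64_t crc64(const unsigned char *buf, size_t len)
    {
        uint64_t c = ~(uint64_t) 0;
        size_t n;
        int k;

        for (n = 0; n < len; n++) {
            c ^= buf[n];
            for (k = 0; k < 8; k++)
                c = (c & 1) ? (c >> 1) ^ UINT64_C(0xC96C5795D7870F42)
                            : c >> 1;
        }
        return ~c;
    }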

Nathan Myers
ncm@zembu.com



RE: beta testing version

From: "Mikheev, Vadim"
> > This may be implemented very fast (if someone points me where
> > I can find CRC func).
> 
> Lifted from the PNG spec (RFC 2083):

Thanks! What about copyright/license?

Vadim


Re: beta testing version

From: Tom Lane
"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
>>>> This may be implemented very fast (if someone points me where
>>>> I can find CRC func).
>> 
>> Lifted from the PNG spec (RFC 2083):

> Thanks! What about copyright/license?

Should fit fine under our regular BSD license.  CRC as such is long
since in the public domain...
        regards, tom lane